Title: Streptococcus suis vaccines and diagnostic tests.

The invention relates to Streptococcus suis infections of 
pigs, to vaccines directed against those infections and to 
tests for diagnosing Streptococcus suis infections.

Streptococcus suis is an important cause of meningitis,
septicemia, arthritis and sudden death in young pigs (4, 46).
Incidentally, it can also cause meningitis in man (1). S. suis
strains are usually identified and classified by their
morphological, biochemical and serological characteristics
(58, 59, 46). Serological classification is based on the
presence of specific antigenic polysaccharides. So far, 35
different serotypes have been described (9, 56, 14). In
several European countries, S. suis serotype 2 is the most
prevalent type isolated from diseased pigs, followed by
serotypes 9 and 1. Serological typing of S. suis is carried
out using different types of agglutination tests. In these
tests, isolated and biochemically characterised S. suis cells
are agglutinated with a panel of 35 specific sera. These
methods are very laborious and time-consuming.

Little is known about the pathogenesis of the disease caused
by S. suis, let alone about its various serotypes such as type
2. Various bacterial components, such as extracellular and
cell-membrane associated proteins, fimbriae, haemagglutinins,
and haemolysin have been suggested as virulence factors (9, 10,
11, 15, 16, 47, 49). However, the precise role of these protein
components in the pathogenesis of the disease remains unclear
(37). It is well known that the polysaccharidic capsule of
various streptococci and other Gram-positive bacteria plays an
important role in pathogenesis (3, 6, 35, 51, 52). The capsule
enables these micro-organisms to resist phagocytosis and is
therefore regarded as an important virulence factor. Recently,
a role of the capsule of S. suis in the pathogenesis was
suggested as well (5). However, the structure, organisation and
functioning of the genes responsible for capsule polysaccharide
synthesis (cps) in S. suis are unknown. Within S. suis serotypes
1 and 2, strains can differ in virulence for pigs (41, 45, 49).
Some type 1 and 2 strains are virulent, other strains are not.
Because both virulent and non-virulent strains of serotypes 1
and 2 are fully encapsulated, it may even be that the
capsule is not a relevant factor required for virulence.

Attempts to control S. suis infections or disease are still
hampered by the lack of knowledge about the epidemiology of the
disease and the lack of effective vaccines and sensitive
diagnostics.

The invention provides an isolated or recombinant nucleic
acid encoding a capsular (cps) gene cluster of Streptococcus
suis. Biosynthesis of capsule polysaccharides in general has
been studied in a number of Gram-positive and Gram-negative
bacteria (32). In Gram-negative bacteria, but also in a number
of Gram-positive bacteria, genes which are involved in the
biosynthesis of polysaccharides are clustered at a single
locus. Streptococcus suis capsular genes as provided by the
invention show a common genetic organisation involving three
distinct regions. The central region is serotype specific and
encodes enzymes responsible for the synthesis and
polymerisation of the polysaccharides. This region is flanked
by two regions conserved in Streptococcus suis which encode
proteins for common functions such as transport of the
polysaccharide across the cellular membrane. However,
between species only low homologies exist, hampering easy
comparison and detection of seemingly similar genes. Knowing
the nucleic acid encoding the flanking regions allows
type-specific determination of nucleic acid of the central region of
Streptococcus suis serotypes, as for example described in the
experimental part of the description of the invention.

The invention provides an isolated or recombinant nucleic
acid encoding a capsular gene cluster of Streptococcus suis or
a gene or gene fragment derived thereof. Such a nucleic acid
is for example provided by hybridising chromosomal DNA derived
from any one of the Streptococcus suis serotypes to a nucleic
acid encoding a gene derived from a Streptococcus suis
serotype 1, 2 or 9 capsular gene cluster, as provided by the
invention (see for example Tables 4 and 5) and cloning of
(type-specific) genes as for example described in the
experimental part of the description. At least 14 open reading
frames are identified. Most of the genes belong to a single
transcriptional unit, indicating co-ordinate control of
these genes; they, and the enzymes and proteins they encode,
act in concert to provide the capsule with the relevant
polysaccharides. The invention provides cps genes and proteins
encoded thereby involved in regulation (CpsA), chain length
determination (CpsB, C), export (CpsC) and biosynthesis (CpsE,
F, G, H, J, K). Although the overall organisation seemed at
first glance to be similar to that of the cps and eps gene
clusters of a number of Gram-positive bacteria (19, 32, 42),
overall homologies are low (see Table 3). The region involved
in biosynthesis is located at the centre of the gene cluster
and is flanked by two regions containing genes with more
common functions.

The invention provides an isolated or recombinant nucleic
acid encoding a capsular gene cluster of Streptococcus suis
serotype 2 or a gene or gene fragment derived thereof,
preferably as identified in Figure 3. Genes in this gene
cluster are involved in polysaccharide biosynthesis of
capsular components and antigens. For a further description of
such genes see for example Table 2 of the description. For
example, a cpsA gene is provided that functions in the
regulation of capsular polysaccharide synthesis, whereas cpsB
and cpsC are functionally involved in chain length
determination. Other genes, such as cpsD, E, F, G, H, I, J, K
and related genes, are involved in polysaccharide synthesis,
functioning for example as glucosyl- or glycosyltransferase.
The cpsF, G, H, I, J genes encode more type-specific proteins
than the flanking genes, which are found more or less conserved
throughout the species and can serve as a basis for selection of
primers or probes in PCR amplification or cross-hybridisation
experiments for subsequent cloning.

For example, the invention further provides an isolated or
recombinant nucleic acid encoding a capsular gene cluster of
Streptococcus suis serotype 1 or a gene or gene fragment
derived thereof, preferably as identified in Figure 4.
In addition, the invention provides an isolated or
recombinant nucleic acid encoding a capsular gene cluster of
Streptococcus suis serotype 9 or a gene or gene fragment
derived thereof, preferably as identified in Figure 5.
Furthermore, the invention provides for example a fragment or
parts thereof of the cps locus, involved in the capsular
polysaccharide biosynthesis, of S. suis, exemplified in the
experimental part for serotype 1, 2 or 9, and allows easy
identification or detection of related fragments derived from
other serotypes of S. suis.

The invention provides a nucleic acid probe or primer
derived from a nucleic acid according to the invention
allowing species or serotype specific detection of
Streptococcus suis. Such a probe or primer (herein used
interchangeably) is for example a DNA, RNA or PNA (peptide
nucleic acid) probe hybridising with capsular nucleic acid as
provided by the invention. Species specific detection is
provided preferably by selecting a probe or primer sequence
from a species-specific region (e.g. flanking region) whereas
serotype specific detection is provided preferably by
selecting a probe or primer sequence from a type-specific
region (e.g. central region) of a capsular gene cluster as
provided by the invention. Such a probe or primer can be used
in a further unmodified form, for example in
cross-hybridisation or polymerase chain reaction (PCR) experiments
as for example described in the experimental part of the
description of the invention. Herein the invention provides
the isolation and molecular characterisation of additional
type-specific cps genes of S. suis types 1 and 9. In addition,
we describe the genetic diversity of the cps loci of serotypes
1, 2 and 9 among the 35 S. suis serotypes known to date.
Type-specific probes are identified. Also, a type-specific PCR
for, for example, serotype 9 is provided, being a rapid, reliable
and sensitive assay, which is used directly on nasal or
tonsillar swabs or other samples of infected or carrier
animals.

The invention also provides a probe or primer according to
the invention further provided with at least one reporter
molecule. Examples of reporter molecules are manifold and
known in the art, for example a reporter molecule can comprise
additional nucleic acid provided with a specific sequence
(e.g. oligo-dT) hybridising to a corresponding sequence to
which hybridisation can easily be detected, for example because
it has been immobilised to a solid support.

Yet other reporter molecules comprise chromophores, e.g.
fluorochromes for visual detection, for example by light
microscopy or fluorescent in situ hybridisation (FISH)
techniques, or comprise an enzyme such as horseradish
peroxidase for enzymatic detection, e.g. in enzyme-linked
assays (EIA). Yet other reporter molecules comprise
radioactive compounds for detection in radiation-based assays.
In a preferred embodiment of the invention, at least one
probe or primer according to the invention is provided
(labelled) with a reporter molecule and a quencher molecule,
providing together with unlabelled probe or primer a PCR-based
test allowing rapid detection of specific hybridisation.

The invention further provides a diagnostic test or test kit
comprising a probe or primer as provided by the invention.
Such a test or test kit, for example a cross-hybridisation
test or PCR-based test, is advantageously used in rapid
detection and/or serotyping of Streptococcus suis.
The invention furthermore provides a protein or fragment
thereof encoded by a nucleic acid according to the invention.
Examples of such a protein or fragment are the
proteins described in for example Table 2 of the description.
For example, a cpsA protein is provided that functions in the
regulation of capsular polysaccharide synthesis, whereas cpsB
and cpsC are functionally involved in chain length
determination. Other proteins or functional fragments thereof
as provided by the invention, such as cpsD, E, F, G, H, I, J,
K and related proteins, are involved in polysaccharide
biosynthesis, functioning for example as glucosyl- or
glycosyltransferase in polysaccharide biosynthesis of
Streptococcus suis capsular antigen.

The invention furthermore provides a method to produce a
Streptococcus suis capsular antigen comprising using a protein
or functional fragment thereof as provided by the invention,
and provides therewith a Streptococcus suis capsular antigen
obtainable by such a method. A comparison of the predicted
amino acid sequences of the cps2 genes with sequences found in
the databases allowed the assignment of functions to the open
reading frames. The central region contains the type-specific
glycosyltransferases and the putative polysaccharide
polymerase. This region is flanked by two regions encoding
proteins with common functions, such as regulation and
transport of polysaccharide across the membrane.
Biosynthesis of Streptococcus capsular polysaccharide antigen
using a protein or functional fragment thereof is
advantageously used in chemo-enzymatic synthesis and the
development of vaccines which offer protection against
serotype-specific streptococcal disease, and is also
advantageously used in the synthesis and development of
multivalent vaccines against streptococcal infections. Such
vaccines elicit anticapsular antibodies which confer
protection.

The invention furthermore provides a vaccine comprising an
antigen according to the invention and further comprising a
suitable carrier or adjuvant. The immunogenicity of a capsular
antigen provided by the invention is for example increased by
linking to a carrier (such as a carrier protein), allowing the
recruitment of T-cell help in developing an immune response.

The invention further provides a recombinant
micro-organism provided with at least a part of a capsular gene
cluster derived from Streptococcus suis. The invention
provides for example a lactic acid bacterium provided with at
least a part of a capsular gene cluster derived from
Streptococcus suis. Various food-grade lactic acid bacteria
(Lactococcus lactis, Lactobacillus casei, Lactobacillus
plantarum and Streptococcus gordonii) have been used as
delivery systems for mucosal immunization. It has now been
shown that oral (or mucosal) administration of recombinant L.
lactis, Lactobacillus, and Streptococcus gordonii can elicit
local IgA and/or IgG antibody responses to an expressed
antigen. The use of oral routes for immunization against
infective diseases is desirable because oral vaccines are
easier to administer, have higher compliance rates, and
because mucosal surfaces are the portals of entry for many
pathogenic microbial agents. It is within the skill of the
artisan to provide such micro-organisms with (additional)
genes.

The invention further provides a recombinant 
Streptococcus suis mutant provided with a modified capsular 
gene cluster. It is within the skill of the artisan to swap 
genes within a species. In a preferred embodiment, an
avirulent Streptococcus suis mutant is selected to be provided
with at least a part of a modified capsular gene cluster 
according to the invention. 

The invention further provides a vaccine comprising a
micro-organism or a mutant provided by the invention. An advantage
of such a vaccine over currently used vaccines is that it
comprises accurately defined micro-organisms and
well-characterised antigens, allowing accurate determination of
immune responses against various antigens of choice.

The invention is further explained in the experimental part
of this description without limiting the invention thereto.

Experimental part:

MATERIALS AND METHODS

Bacterial strains and growth conditions.
The bacterial strains and plasmids used in this study are
listed in Table 1. S. suis strains were grown in Todd-Hewitt
broth (Oxoid) and plated on Columbia agar blood
base (code CM331, Oxoid) containing 6% (v/v) horse blood.
E. coli strains were grown in Luria broth (28) and plated on
Luria broth containing 1.5% (w/v) agar. If required,
antibiotics were added to the plates at the following
concentrations: spectinomycin, 100 µg/ml for S. suis and 50
µg/ml for E. coli; ampicillin, 50 µg/ml.
Serotyping. The S. suis strains were serotyped by the slide
agglutination test with serotype-specific antibodies (44).
DNA techniques. Routine DNA manipulations were performed as
described by Sambrook et al. (36).

Alkaline phosphatase activity. To screen for PhoA fusions in
E. coli, plasmid libraries were constructed. To this end,
chromosomal DNA of S. suis type 2 was digested with AluI. The
300-500-bp fragments were ligated to SmaI-digested pPHOS2.
Ligation mixtures were transformed to the PhoA- E. coli strain
CC118. Transformants were plated on LB media supplemented with
5-bromo-4-chloro-3-indolyl phosphate (BCIP, 50 µg/ml, Boehringer,
Mannheim, Germany). Blue colonies were purified on fresh
LB/BCIP plates to verify the blue phenotype.

DNA sequence analysis. DNA sequences were determined on a 373A
DNA Sequencing System (Applied Biosystems, Warrington, GB).
Samples were prepared by use of an ABI/PRISM dye terminator
cycle sequencing ready reaction kit (Applied Biosystems).
Sequencing data were assembled and analyzed using the
MacMollyTetra program. Custom-made sequencing primers were
purchased from Life Technologies. Hydrophobic stretches within
proteins were predicted by the method of Klein et al. (17). The
BLAST program available on Netscape Navigator(TM) was used to
search for protein sequences related to the deduced amino acid
sequences.

Construction of gene-specific knock-out mutants of S. suis. To
construct the mutant strains 10cpsB and 10cpsEF we
electrotransformed the pathogenic serotype 2 strain 10
(45, 49) of S. suis with pCPS11 and pCPS28 respectively. In
these plasmids the cpsB and cpsEF genes were disrupted by the
insertion of a spectinomycin-resistance gene. To create pCPS11
the internal 400 bp PstI-BamHI fragment of the cpsB gene in
pCPS7 was replaced by the SpcR gene. For this purpose pCPS7 was
digested with PstI and BamHI and ligated to the 1,200-bp
PstI-BamHI fragment, containing the SpcR gene, from pIC-spc. To
construct pCPS28 we used pIC20R. In this plasmid we
inserted the KpnI-SalI fragment from pCPS17 (resulting in
pCPS25) and the XbaI-ClaI fragment from pCPS20 (resulting in
pCPS27). pCPS27 was digested with PstI and XhoI and ligated to
the 1,200-bp PstI-XhoI fragment, containing the SpcR gene, of
pIC-spc. The electrotransformation to S. suis was carried out
as described before (38).

Southern blotting and hybridization. Chromosomal DNA was
isolated as described by Sambrook et al. (36). DNA fragments
were separated on 0.8% agarose gels and transferred to
Zeta-Probe GT membranes (Bio-Rad) as described by Sambrook et al.
(36). DNA probes were labelled with [α-32P]dCTP (3000 Ci
mmol-1; Amersham) by use of a random primed labelling kit
(Boehringer). The DNA on the blots was hybridized at 65°C with
appropriate DNA probes as recommended by the supplier of the
Zeta-Probe membranes. After hybridization, the membranes were
washed twice with a solution of 40 mM sodium phosphate, pH 7.2,
1 mM EDTA, 5% SDS for 30 min at 65°C and twice with a solution
of 40 mM sodium phosphate, pH 7.2, 1 mM EDTA, 1% SDS for 30 min
at 65°C.

PCR. The primers used in the cps2J PCR correspond to
positions 13791-13813 and 14465-14443 in the S. suis cps2
locus. The sequences were: 5'-CAAACGCAAGGAATTACGGTATC-3' and
5'-GAGTATCTAAAGAATGCCTATTG-3'. The primers used for the cps1I
PCR correspond to positions 4398-4417 and 4839-4821 in the
S. suis cps1 sequence. The sequences were:
5'-GGCGGTCTAGCAGATGCTCG-3' and 5'-GCGAACTGTTAGCAATGAC-3'. The
primers used in the cps9H PCR correspond to positions
4406-4126 and 4494-4475 in the S. suis cps9 sequence. The
sequences were: 5'-GGCTACATATAATGGAAGCCC-3' and
5'-CGGAAGTATCTGGGCTACTG-3'.
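
The primer data above lend themselves to a quick consistency check. The following is a minimal Python sketch, not part of the described method: it recomputes primer lengths, GC content and a rule-of-thumb melting temperature (the Wallace 2+4 rule, which is an assumption introduced here for illustration), and derives the expected amplicon length under the assumption that the printed positions are inclusive coordinates of the primer 5' ends. The cps9H coordinates are left out of the amplicon calculation because the printed forward range does not match the 21-nt primer length. All function and variable names are hypothetical.

```python
# Primer sequences (5'->3') copied from the PCR paragraph above.
PRIMERS = {
    "cps2J": ("CAAACGCAAGGAATTACGGTATC", "GAGTATCTAAAGAATGCCTATTG"),
    "cps1I": ("GGCGGTCTAGCAGATGCTCG", "GCGAACTGTTAGCAATGAC"),
    "cps9H": ("GGCTACATATAATGGAAGCCC", "CGGAAGTATCTGGGCTACTG"),
}

# 5'-end coordinates (forward, reverse) as printed in the text.
# Assumption: both coordinates are inclusive positions on the same strand.
COORDS = {
    "cps2J": (13791, 14465),
    "cps1I": (4398, 4839),
}


def gc_fraction(seq):
    """Fraction of G and C bases in the primer."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rule-of-thumb Tm in degrees C: 2 per A/T plus 4 per G/C (illustrative only)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def amplicon_length(fwd_start, rev_start):
    """Length of the product spanning both primer 5' ends, inclusive."""
    return rev_start - fwd_start + 1


if __name__ == "__main__":
    for name, pair in PRIMERS.items():
        for label, seq in zip(("fwd", "rev"), pair):
            print(name, label, len(seq), round(gc_fraction(seq), 2), wallace_tm(seq))
    for name, (fwd, rev) in COORDS.items():
        print(name, "expected amplicon:", amplicon_length(fwd, rev), "bp")
```

Under these assumptions the cps2J and cps1I products would be 675 bp and 442 bp long, respectively; actual annealing temperatures would still have to be determined experimentally.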

Electron microscopy. Bacteria were prepared for electron
microscopy as described by Wagenaar et al. (50). Briefly,
bacteria were mixed with agarose MP (Boehringer) at 31°C to a
concentration of 0.7%. The mixture was immediately cooled on
ice. Upon gelling, samples were cut into 1 to 1.5 mm slices
and incubated in a fixative containing 0.8% glutaraldehyde and
0.8% osmium tetroxide. Subsequently, the samples were fixed
and stained with uranyl acetate by microwave stimulation,
dehydrated and embedded in Epon-Araldite resin. Ultra-thin
sections were counterstained with lead citrate and examined
with a Philips CM 10 electron microscope at 80 kV.
Isolation of porcine alveolar macrophages (AM). Porcine AM were
obtained from the lungs of specific pathogen free (SPF) pigs.
Lung lavage samples were collected as described by van Leengoed
et al. (43). Cells were suspended in EMEM containing 6% (v/v)
SPF-pig serum and adjusted to 10"^ cells per ml. 

Identification of the cps locus.

The first part of the cps locus of S. suis type 2 was identified
by making use of a strategy developed for the genetic
identification of exported proteins (13, 31). In this system we
made use of a plasmid (pPHOS2) containing a truncated alkaline
phosphatase gene (13). The gene lacked the promoter sequence,
the translational start site and the signal sequence. The
truncated gene is preceded by a unique SmaI restriction site.
Chromosomal DNA of S. suis type 2, digested with AluI, was
randomly cloned in this restriction site. Because translocation
of PhoA across the cytoplasmic membrane of E. coli is required
for enzymatic activity, the system can be used to select for S.
suis fragments containing a promoter sequence, a translational
start site and a functional signal sequence. Among 560
individual E. coli clones tested, 16 displayed a dark blue
phenotype when plated on media containing BCIP. DNA sequence
analysis of the inserts from several of these plasmids was
performed (results not shown) and the deduced amino acid
sequences were analyzed. The hydrophobicity profile of one of
the clones (pPHOS7, results not shown) showed that the
N-terminal part of the sequence resembled the characteristics of
a typical signal peptide: a short hydrophilic N-terminal region
is followed by a hydrophobic region of 38 amino acids. These
data indicate that the phoA system was successfully used for
the selection of S. suis genes encoding exported proteins.
Moreover, the sequences were analyzed for similarities present
in the databases. The sequence of pPHOS7 showed a high
similarity (37% identity) with the protein encoded by the
cps14C gene of Streptococcus pneumoniae (19). This strongly
suggests that pPHOS7 contains a part of the cps operon of S.
suis type 2.
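
The kind of hydrophobicity profile referred to above can be illustrated with a short Python sketch. Note the assumptions: the description uses the method of Klein et al. (17), whereas this sketch substitutes the widely used Kyte-Doolittle hydropathy scale with a sliding window; the window size, the threshold and the toy peptide are hypothetical and are not data from pPHOS7.

```python
# Kyte-Doolittle hydropathy values per amino acid (one-letter code).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Mean hydropathy of every full-length window along the sequence."""
    scores = [KD[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def first_hydrophobic_window(seq, window=9, threshold=1.6):
    """Start index of the first window whose mean exceeds the threshold, else None.

    The threshold is an illustrative assumption, not a calibrated cut-off.
    """
    for i, value in enumerate(hydropathy_profile(seq, window)):
        if value > threshold:
            return i
    return None


# Hypothetical toy peptide: charged N-terminus, hydrophobic core, polar tail.
TOY = "MKKR" + "LLVALLAIV" * 2 + "SDEKRNQ"

if __name__ == "__main__":
    print([round(v, 2) for v in hydropathy_profile(TOY)])
    print("first hydrophobic window starts at index", first_hydrophobic_window(TOY))
```

A signal-peptide-like sequence shows low values at the extreme N-terminus followed by a sustained hydrophobic stretch, which is the pattern described for the pPHOS7 insert.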

Cloning of the flanking cps genes. In order to clone the
flanking cps genes of S. suis type 2 the insert of pPHOS7 was
used as a probe to identify chromosomal DNA fragments which
contain flanking cps genes. A 6-kb HindIII fragment was
identified and cloned in pKUN19. This yielded clone pCPS6 (Fig.
1C). Sequence analysis of the insert of pCPS6 revealed that
pCPS6 most probably contained the 5'-end of the cps locus, but
still lacked the 3'-end (see below). Therefore, sequences of
the 3'-end of pCPS6 were in turn used as a probe to identify
chromosomal fragments containing cps sequences located further
downstream. These fragments were also cloned in pKUN19,
resulting in pCPS17. Using the same system of chromosomal
walking we subsequently generated the plasmids pCPS18, pCPS20,
pCPS23 and pCPS26, containing downstream cps sequences.
Analysis of the cps operon. The complete nucleotide sequence of
the cloned fragments was determined (Figure 4). Examination of
the compiled sequence revealed the presence of at least 13
potential open reading frames (Orfs), which were designated as
Orf2Y, Orf2X and Cps2A-Cps2K (Fig. 1A). Moreover, a 14th,
incomplete, Orf (Orf2Z) was located at the 5'-end of the
sequence. Two potential promoter sequences were identified. One
was located 313 bp (locations 1885-1865 and 1884-1889)
upstream of Orf2X. The other potential promoter sequence was
located 68 bp upstream of Orf2Y (locations 2241-2236 and
2216-2211). Orf2Y is expressed in opposite orientation. Between Orfs
2Y and 2Z the sequence contained a potential stem-loop
structure, which could act as a transcription terminator. Each
Orf is preceded by a ribosome-binding site and the majority of
the Orfs are very closely linked. The only significant
intergenic gap was found between Cps2G and Cps2H (389
nucleotides). However, no obvious promoter sequences or
potential stem-loop structures were found in this region. These
data suggest that Orf2X and Cps2A-Cps2K are arranged as an
operon.

An overview of all Orfs with their properties is shown in
Table 2. The majority of the predicted gene products are related
to proteins involved in polysaccharide biosynthesis. Orf2Z
showed some similarity with the YitS protein of Bacillus
subtilis. YitS was identified during the sequence analysis of
the complete genome of B. subtilis. The function of the protein
is unknown.

Orf2Y showed similarity with the YcxD protein of B. subtilis
(53). Based on the similarity between YcxD and MocR of
Rhizobium meliloti (33), YcxD was suggested to be a regulatory
protein.

Orf2X showed similarity with the hypothetical YAAA proteins
of Haemophilus influenzae and E. coli. The function of these
proteins is unknown.
The gene products encoded by the cps2A, cps2B, cps2C and
cps2D genes showed approximate similarity with the CpsA, CpsC,
CpsD and CpsB proteins of several serotypes of Streptococcus
pneumoniae (19), respectively. This suggests similar functions
for these proteins. Hence Cps2A may have a role in the
regulation of the capsular polysaccharide synthesis. Cps2B and
Cps2C could be involved in the chain length determination of
the type 2 capsule and Cps2C can play an additional role in the
export of the polysaccharide. The Cps2D protein of S. suis is
related to the CpsB protein of S. pneumoniae and to proteins
encoded by genes of several other Gram-positive bacteria
involved in polysaccharide or exopolysaccharide synthesis, but
their function is unknown (19).

The protein encoded by the cps2E gene showed similarity to
several bacterial proteins with glycosyltransferase
activities: Cps14E and Cps19fE of S. pneumoniae serotypes 14
and 19F (18, 19, 29), CpsE of Streptococcus salivarius (X94980)
and CpsD of Streptococcus agalactiae (34). Recently, Kolkman et
al. (18) showed that Cps14E is a glucosyl-1-phosphate
transferase that links glucose to a lipid carrier, the first
step in the biosynthesis of the S. pneumoniae type 14 repeating
unit. Based on these data a similar function may be fulfilled
by Cps2E of S. suis.

The protein encoded by the cps2F gene showed similarity to
the protein encoded by the rfbU gene of Salmonella enterica
(25). This similarity is most pronounced in the C-terminal
regions of these proteins. The rfbU gene was shown to encode
mannosyltransferase activity (25).

The cps2G gene encoded a protein that showed moderate
similarity with the rfbF gene product of Campylobacter hyoilei
(22), the epsF gene product of S. thermophilus (40) and the
capM gene product of S. aureus (24). On the basis of
similarity the rfbF, epsF and capM genes are suggested to
encode galactosyltransferase activities. Hence, a similar
glycosyltransferase activity could be fulfilled by the cps2G
gene product.

The cps2H gene encodes a protein that is similar to the
N-terminal region of the lgtD gene product of Haemophilus
influenzae (U32768). Moreover, the hydrophobicity plots of
Cps2H and LgtD looked very similar in these regions (data not
shown). Based on sequence similarity the lgtD gene product was
suggested to have glycosyltransferase activity (U32768).
The gene product encoded by the cps2I gene showed some
similarity with a protein of Actinobacillus
actinomycetemcomitans (AB002668). This protein is part of the
gene cluster responsible for the serotype-b-specific antigen of
A. actinomycetemcomitans. The function of the protein is unknown.

The gene products encoded by the cps2J and cps2K genes
showed significant similarities to the Cps14J protein of S.
pneumoniae. The cps14J gene of S. pneumoniae was shown to
encode a β-1,4-galactosyltransferase activity. In S.
pneumoniae CpsJ is responsible for the addition of the fourth
(i.e. last) sugar in the synthesis of the S. pneumoniae
serotype 14 polysaccharide (20). Some similarity was even
found between Cps2J and Cps2K (Fig. 2, 25.5% similarity). This
similarity was most pronounced in the N-terminal regions of the
proteins. Recently, two small conserved regions were identified
in the N-terminus of Cps14J and Cps14I and their homologues
(20). These regions were predicted to be important for
catalytic activity. Both regions, DXS and DXDD (Fig. 2), were
also found in Cps2J and Cps2K.

Distribution of the cps2 genes in other S. suis serotypes. To
examine the relationship between the cps2 genes and cps genes
in the other S. suis serotypes, we performed
cross-hybridization experiments. DNA fragments of the individual
cps2 genes were amplified by PCR, labelled with 32P, and used
to probe Southern blots of chromosomal DNA of the reference
strains of the 35 different S. suis serotypes. Large variation
in the hybridization patterns was observed (Table 4). As a
positive control we used a probe specific for 16S rRNA. The
16S rRNA probe hybridized with all serotypes tested. However,
none of the other genes tested were common to all serotypes.
Based on the genetic organization of the genes we previously
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suggested that orfX and cpsA-cpsK genes are part of one operon 
and that the protein encoded by these genes are all involved 
in polysaccharide biosynthesis* OrfY and OrfZ are. not a part 
of this operon, and their role in the polysaccharide 
5 biosynthesis is unclear. Based on sequence similarity data, 
OrfY may be involved in regulation of the cps2 genes. OrfZ is
proposed to be unrelated to polysaccharide biosynthesis.
Probes specific for the orfZ, orfY, orfX, cpsA, cpsB, cpsC and
cpsD genes hybridized with most other serotypes. This suggests
that the proteins encoded by these genes are not type-specific,
but may perform more common functions in biosynthesis of the
capsular polysaccharide. This confirms previous data which
showed that the cps2A-cps2D genes have strong similarity to
cps genes of several serotypes of Streptococcus pneumoniae.
Based on this similarity Cps2A is possibly a regulatory
protein, whereas Cps2B and Cps2C may play a role in length
determination and export of polysaccharide. The cps2E gene
hybridized with DNA of serotypes 1, 2, 14 and 1/2. The cps2E
gene showed a strong similarity to the cps14E gene of S.
pneumoniae (18). The enzyme encoded by cps14E was shown to have
glucosyl-1-phosphate transferase activity and to catalyze the
transfer of glucose to a lipid carrier (18). These data indicate
that a glycosyltransferase closely related to Cps14E may be
responsible for the first step in the biosynthesis of
polysaccharide in the S. suis serotypes 1, 2, 14 and 1/2. The
cps2F, cps2G, cps2H, cps2I and cps2J genes hybridized with
chromosomal DNA of serotypes 2 and 1/2 only. The cps2G gene
showed an additional weak hybridization signal with DNA of
serotype 34. In agglutination tests serotype 1/2 showed
agglutination with sera specific for serotype 2 as well as
with sera specific for serotype 1. This suggests that serotype
1/2 shares antigenic determinants with both types 1 and 2. The
hybridization data confirmed these data. All putative
glycosyltransferases present in serotype 2 are also present in
serotype 1/2. The cps2K gene showed a hybridization pattern
similar to that of the cps2E gene. Hybridization was observed
with DNA of serotypes 1, 2, 14 and 1/2. Taken together these
hybridization data show that the cps2 gene cluster can be
divided in three regions: a central region containing the
type-specific genes is flanked by two regions containing
common genes for various serotypes.
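
The three-region reading of the hybridization data can be made
explicit with a small sketch. The Python below is an illustration
under stated assumptions, not part of the described method: the gene
order is assumed to follow the alphabetical gene designations, the
label "most" stands in for "hybridized with most other serotypes"
because the individual serotypes are not listed in this passage, the
weak serotype 34 signal of cps2G is ignored, and a gene is called
type-specific when its probe hybridized with serotypes 2 and 1/2
only. The names HYBRIDIZATION, is_type_specific and regions are
hypothetical.

```python
from itertools import groupby

# Assumed map order of the cps2 cluster (alphabetical gene designations).
GENE_ORDER = [
    "orfZ", "orfY", "orfX", "cps2A", "cps2B", "cps2C", "cps2D",
    "cps2E", "cps2F", "cps2G", "cps2H", "cps2I", "cps2J", "cps2K",
]

MOST = "most"  # placeholder: probe hybridized with most other serotypes

# Serotypes whose DNA hybridized with each probe, as stated in the text.
HYBRIDIZATION = {
    "orfZ": MOST, "orfY": MOST, "orfX": MOST,
    "cps2A": MOST, "cps2B": MOST, "cps2C": MOST, "cps2D": MOST,
    "cps2E": {"1", "2", "14", "1/2"},
    "cps2F": {"2", "1/2"},
    "cps2G": {"2", "1/2"},  # weak serotype 34 signal ignored (assumption)
    "cps2H": {"2", "1/2"},
    "cps2I": {"2", "1/2"},
    "cps2J": {"2", "1/2"},
    "cps2K": {"1", "2", "14", "1/2"},
}


def is_type_specific(pattern):
    """True when the probe hybridized with serotypes 2 and 1/2 only."""
    return pattern != MOST and pattern <= {"2", "1/2"}


def regions():
    """Group consecutive genes that share the same label."""
    labelled = [
        (gene, "type-specific" if is_type_specific(HYBRIDIZATION[gene]) else "common")
        for gene in GENE_ORDER
    ]
    return [
        (label, [gene for gene, _ in group])
        for label, group in groupby(labelled, key=lambda item: item[1])
    ]


if __name__ == "__main__":
    for label, genes in regions():
        print(label, genes)
```

With these inputs the genes fall into a common block, a central
type-specific block (cps2F to cps2J) and a second common block, which
matches the division described above.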

Cloning of the type-specific cps genes of serotypes 1 and 9.

To clone the type-specific cps genes of S. suis serotype 1 we
used the cps2E gene as a probe to identify chromosomal DNA
fragments of type 1 which contain flanking cps genes. A 5 kb
EcoRV fragment was identified and cloned in pKUN19. This
yielded pCPS1-1 (Fig. 1B). This fragment was in turn used as a
probe to identify an overlapping 2.2 kb HindIII fragment.
pKUN19 containing this HindIII fragment was designated pCPS1-2.
The same strategy was followed to identify and clone the
type-specific cps genes of serotype 9. In this case, we used
the cps2D gene as a probe. A 0.8 kb HindIII-XbaI fragment was
identified and cloned, yielding pCPS9-1 (Fig. 1C). This
fragment was in turn used as a probe to identify a 4 kb XbaI
fragment. pKUN19 containing this 4 kb XbaI fragment was
designated pCPS9-2.

Analysis of the cloned cps1 genes. The complete nucleotide
sequence of the inserts of pCPS1-1 and pCPS1-2 was determined
(figure 5). Examination of the sequence revealed the presence
of five complete and two incomplete Orfs (Fig. 1B). Each Orf
is preceded by a ribosome-binding site. In accord with data
obtained for the cps2 genes of serotype 2, the majority of the
Orfs are very closely linked. The only significant gap (718 bp)
was found between Cps1G and Cps1H. No obvious promoter
sequences or potential stem-loop structures could be found in
this region. This suggests that, as in serotype 2, the cps
genes in serotype 1 are arranged in an operon.

An overview of the Orfs and their properties is shown in
Table 2. As expected on the basis of the hybridization data
(Table 4), the protein encoded by the cps1E gene was related
to Cps2E of S. suis type 2 (identity of 86%). The fragment
cloned in pCPS1-1 lacked the coding region for the first 7
amino acids of the cps1E gene.
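
As an illustration of how an identity figure such as the 86% quoted
above can be derived from a pairwise alignment, the sketch below
counts identical residues over aligned, ungapped columns. This is an
assumption about the definition: alignment programs differ in what
they use as the denominator, and the text does not state which
program or definition was used. The function name percent_identity
and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical residues over columns where neither sequence has a gap.

    Both inputs must be rows of the same pairwise alignment, with '-'
    marking gaps, so they must have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gapped columns are left out of the denominator
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment, not taken from Cps1E or Cps2E.
    print(percent_identity("MKKL-V", "MKRLAV"))
```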

The proteins encoded by the cps1F and cps1G genes showed
strong similarity to the Cps14F and Cps14G proteins of
Streptococcus pneumoniae serotype 14, respectively (20). The
function of Cps14F is not completely clear, but it has
been suggested that Cps14F has a role in enhancing
glycosyltransferase activity. The cps14G gene of S. pneumoniae
was shown to encode β-1,4-galactosyltransferase activity. In
S. pneumoniae type 14 this activity is required for the second
step in the biosynthesis of the oligosaccharide subunit (20).
Based on the similarity data, similar glycosyltransferase
and enhancing activities are suggested for the cps1G and
cps1F genes of S. suis type 1.

The protein encoded by the cps1H gene showed similarity to
the Cps14H protein of S. pneumoniae (20). Based on sequence
similarity Cps14H was proposed to be the polysaccharide
polymerase (20).

The protein encoded by the cps1I gene showed some
similarity with the Cps14J protein of S. pneumoniae (19). The
cps14J gene was shown to encode a β-1,4-galactosyltransferase
activity, responsible for the addition of the fourth (i.e.
last) sugar in the synthesis of the S. pneumoniae serotype 14
polysaccharide.

Between Cps1G and Cps1H a gap of 718 bp was found. This
region revealed three small Orfs. The three Orfs were
expressed in three different reading frames, were not
preceded by potential ribosome-binding sites and did not contain
potential start sites. However, the three potential gene
products encoded by this region showed some similarity with
three successive regions of the C-terminal part of the EpsK
protein of Streptococcus thermophilus (27% identity, 40). The
region related to the first 82 amino acids is lacking.

Analysis of the cloned cps9 genes. We also determined the
complete nucleotide sequence of the inserts of pCPS9-1 and
pCPS9-2 (figure 6). Examination of the sequence revealed the
presence of three complete and two incomplete Orfs (Fig. 1C).
As in serotypes 1 and 2, all Orfs are preceded by a
ribosome-binding site and are very closely coupled. As suggested
by the hybridization data (Table 4) the Cps2D and Cps9D proteins
were highly related (Table 2). Based on sequence comparisons,
pCPS9-1 lacked the first 27 amino acids of the Cps9D protein.

The protein encoded by the cps9E gene showed some
similarity with the CapD protein of Staphylococcus aureus
serotype 1 (24). Based on sequence similarity data the Cap1D
protein was suggested to be an epimerase or a dehydratase
involved in the synthesis of N-acetylfructosamine or
N-acetylgalactosamine (53).

Cps9F showed some similarity to the CapM proteins of S.
aureus serotypes 5 and 8 (61, 64, 65). Based on sequence
similarity data Cap5M and Cap8M are proposed to be
glycosyltransferases (63).

The protein encoded by the cps9G gene showed some
similarity with a protein of Actinobacillus
actinomycetemcomitans (AB002668_4). This protein is part of a
gene cluster responsible for the serotype b-specific antigens
of Actinobacillus actinomycetemcomitans. The function of the
protein is unknown.

The protein encoded by the cps9H gene showed some
similarity with the product of the rfbB gene of Yersinia
enterocolitica (68). The RfbB protein was shown to be essential
for O-antigen synthesis, but the function of the protein in the
synthesis of the O:3 lipopolysaccharide is unknown.

Serotype 1 and serotype 9 specific cps genes. To determine
whether the cloned fragments in pCPS1-1, pCPS1-2, pCPS9-1 and
pCPS9-2 contained the type-specific genes for serotype 1 and
9, respectively, cross-hybridization experiments were
performed. DNA fragments of the individual cps1 and cps9 genes
were amplified by PCR, labelled with 32P, and used to probe
Southern blots of chromosomal DNA of the reference strains of
the 35 different S. suis serotypes. The results are shown in
Table 5. Based on the data obtained with the cps2E probe
(Table 4), the cps1E probe was expected to hybridize with
chromosomal DNA of S. suis serotypes 1, 2, 14, 27 and 1/2. The
cps1H, cps9E and cps9F probes hybridized with most other
serotypes. However, the cps1F, cps1G and cps1I probes
hybridized with chromosomal DNA of serotypes 1 and 14 only.
The cps9G and cps9H probes hybridized with serotype 9 only.
These data suggest that the cps9G and cps9H probes are
specific for serotype 9 and therefore could be useful tools
for the development of rapid and sensitive diagnostic tests
for S. suis type 9 infections.

Type-specific PCR. So far, the probes were tested on the 35
different reference strains only. To test the diagnostic value
of the type-specific cps probes further, several other S. suis
serotype 1, 2, 1/2, 9 and 14 strains were used. Moreover,
since a PCR-based method would be even more rapid and
sensitive than a hybridization test, we tested whether we
could use a PCR for the serotyping of the S. suis strains. The
oligonucleotide primer sets were chosen within the cps2J,
cps1I and cps9H genes. Amplified fragments of 675 bp, 380 bp
and 390 bp were expected, respectively. The results show that
675 bp fragments were amplified on type 2 and 1/2 strains
using cps2J primers; 380 bp fragments were amplified on type 1
and 14 strains using cps1I primers; and 390 bp fragments were
amplified on type 9 strains using cps9H primers.
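
The interpretation of the three PCR reactions can be summarised in a
short sketch. The Python below simply encodes the expected fragment
sizes and the serotypes on which each primer set gave a product, as
reported above. The function name call_serotypes, the input layout
(one observed band size per primer set) and the size tolerance of
20 bp are hypothetical choices made for illustration only; they are
not part of the described test.

```python
# Expected amplicon size (bp) and the serotypes that gave a product
# with each primer set, as reported in the text.
EXPECTED = {
    "cps2J": (675, {"2", "1/2"}),
    "cps1I": (380, {"1", "14"}),
    "cps9H": (390, {"9"}),
}


def call_serotypes(observed, tolerance_bp=20):
    """Return the serotypes consistent with the observed PCR bands.

    observed maps a primer set name to the band size in bp seen in that
    reaction, or to None when no product was seen. tolerance_bp is an
    assumed allowance for reading band sizes from a gel.
    """
    calls = set()
    for primer_set, (size, serotypes) in EXPECTED.items():
        band = observed.get(primer_set)
        if band is not None and abs(band - size) <= tolerance_bp:
            calls |= serotypes
    return calls


if __name__ == "__main__":
    print(call_serotypes({"cps2J": 675, "cps1I": None, "cps9H": None}))
    print(call_serotypes({"cps9H": 390}))
```

Because each primer set is run as a separate reaction, the similar
sizes of the cps1I (380 bp) and cps9H (390 bp) products do not need to
be resolved against each other. A cps2J product does not distinguish
serotype 2 from 1/2, and a cps1I product does not distinguish serotype
1 from 14, so the function returns both candidates in those cases.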

DISCUSSION 

We describe the identification and the molecular
characterisation of the cps locus, involved in the capsular
polysaccharide biosynthesis, of S. suis serotype 2.
A region of 16 kb was cloned and sequenced. Fourteen open reading
frames were identified. Most of the genes seemed to belong to a
single transcriptional unit, suggesting a co-ordinate control
of these genes. We assign functions to most of the gene
products. We thereby identified regions involved in regulation
(Cps2A), chain length determination (Cps2B, C), export (Cps2C)
and biosynthesis (Cps2E, F, G, H, J, K). The region involved in
biosynthesis is located at the centre of the gene cluster and
is flanked by two regions containing genes with more common
functions. The incomplete orf2Z gene was located at the 5'-end
of the cloned fragment. Orf2Z showed some similarity with the
YitS protein of B. subtilis. However, because the function of
the YitS protein is unknown this did not give us any
information about the possible function of Orf2Z. Because the
orf2Z gene is not a part of the cps operon, a role of this gene
in polysaccharide biosynthesis is not expected. The Orf2Y
protein showed some similarity with the YcxD protein of
B. subtilis (53). The YcxD protein was suggested to be a
regulatory protein. Similarly, Orf2Y may be involved in the
regulation of polysaccharide biosynthesis. The Orf2X protein
showed similarity with the YAAA proteins of H. influenzae and
E. coli. The function of these proteins is unknown. In S. suis
type 2 the orf2X gene seemed to be the first gene in the cps2
operon. This suggests a role of Orf2X in the polysaccharide
biosynthesis. In H. influenzae and E. coli, however, these
proteins are not associated with capsular gene clusters. The
analysis of isogenic mutants impaired in the expression of
Orf2X should give more insight into the presumed role of Orf2X in
the polysaccharide biosynthesis of S. suis type 2.

The gene products encoded by the cps2E, cps2F, cps2G, cps2H,
cps2J and cps2K genes showed little similarity with
glycosyltransferases of several Gram-positive or Gram-negative
bacteria (18, 19, 20, 22, 25). The cps2E gene product shows
some similarity with the Cps14E protein of S. pneumoniae (18,
19). Cps14E is a glucosyl-1-phosphate transferase that links
glucose to a lipid carrier (18). In S. pneumoniae this is the
first step in the biosynthesis of the oligosaccharide repeating
unit. The structure of the S. suis serotype 2 capsule contains
glucose, galactose, rhamnose, N-acetyl glucosamine and sialic
acid in a ratio of 3:1:1:1:1 (7). Based on these data we
conclude that Cps2E of S. suis has glucosyltransferase
activity, and is involved in the linkage of the first sugar to
the lipid carrier.

The C-terminal region of the cps2F gene product showed some
similarity with the RfbU protein of Salmonella enterica. RfbU was
shown to have mannosyltransferase activity (24). Because
mannose is not a component of the S. suis type 2
polysaccharide, a mannosyltransferase activity is not expected
in this organism. Nevertheless, cps2F encodes a
glycosyltransferase with another sugar specificity.

Cps2G showed moderate similarity to a family of gene
products suggested to encode galactosyltransferase activities
(22, 24, 40). Hence a similar activity is shown for Cps2G.
Cps2H showed some similarity with LgtD of H. influenzae
(U32768). Because LgtD was proposed to have glycosyltransferase
activity, a similar activity is fulfilled by Cps2H.

Cps2J and Cps2K showed similarity to Cps14J of S. pneumoniae
(20). Cps2J showed similarity with Cps14I of S. pneumoniae as
well. Cps14I was shown to have N-acetyl glucosaminyltransferase
activity, whereas Cps14J has a β-1,4-galactosyltransferase
activity (20). In S. pneumoniae Cps14I is responsible for the
addition of the third sugar and Cps14J for the addition of the
last sugar in the synthesis of the type 14 repeating unit
(20). Because the capsule of S. suis type 2 contains galactose
as well as N-acetyl glucosamine components,
galactosyltransferase as well as N-acetyl
glucosaminyltransferase activities could be envisaged for the
cps2J and cps2K gene products, respectively. As was observed
for Cps14I and Cps14J, the N-termini of Cps2J and Cps2K showed
a significant degree of sequence similarity. Within the
N-terminal domains of Cps14I and Cps14J, two small regions were
identified, which were also conserved in several other
glycosyltransferases (22). Within these two regions, two Asp
residues were proposed to be important for catalytic activity.
The two conserved regions, DXS and DXDD, were also found in
Cps2J and Cps2K.
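
The two conserved regions are short patterns in which X stands for any
amino acid, so they can be located with a simple pattern search. The
sketch below is a minimal illustration, assuming one-letter amino acid
codes and treating X as a wildcard. The function name find_motifs and
the toy peptide are hypothetical, and the sketch makes no claim about
where the motifs lie in Cps2J or Cps2K.

```python
import re

# X in the motif names is "any residue", written as '.' in the patterns.
MOTIFS = {
    "DXS": "D.S",
    "DXDD": "D.DD",
}


def find_motifs(protein: str) -> dict:
    """Return 1-based start positions of each motif in a protein sequence.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    protein = protein.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        lookahead = re.compile(f"(?={pattern})")
        hits[name] = [m.start() + 1 for m in lookahead.finditer(protein)]
    return hits


if __name__ == "__main__":
    # Toy peptide for illustration only, not a real Cps sequence.
    print(find_motifs("MKDASLLDGDDK"))
```

A hit from such a search only shows that the pattern occurs. Whether
it lies at the position corresponding to the conserved regions of
Cps14I and Cps14J would still have to be checked in an alignment.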

The function of Cps2I remains unclear. Cps2I showed some
similarity with a protein of A. actinomycetemcomitans. Although
this protein is part of the gene cluster responsible for the
serotype b-specific antigens, the function of the protein is
unknown.

We further describe the identification and characterization of
the cps genes specific for S. suis serotypes 1, 2 and 9. After
the entire cps2 locus of S. suis serotype 2 was cloned and
characterized, functions for most of the cps2 gene products
could be assigned by sequence homologies. Based on these data
the glycosyltransferase activities, required for type
specificity, could be located in the centre of the operon.
Cross-hybridization experiments, using the individual cps2
genes as probes on chromosomal DNAs of the 35 different
serotypes, confirmed this idea. The regions containing the
type-specific genes of serotypes 1 and 9 could be cloned and
characterized, showing that an identical genetic organization
of the cps operons of other S. suis serotypes exists. The
cps1E, cps1F, cps1G, cps1H and cps1I genes revealed a
striking similarity with the cps14E, cps14F, cps14G, cps14H and
cps14J genes of S. pneumoniae. Interestingly, S. pneumoniae
serotype 14 is the serotype most commonly associated with
pneumococcal infections in young children (54), whereas S.
suis serotype 1 strains are most commonly isolated from
piglets younger than 8 weeks (46). In S. pneumoniae the
cps14E, cps14G, cps14I and cps14J genes encode the
glycosyltransferases required for the synthesis of the type 14
tetrameric repeating unit, showing that the cps1E, cps1G and
cps1I genes encode glycosyltransferases. The precise
functions of these genes as well as the substrate
specificities of the enzymes can be established. In S.
pneumoniae the cps14E gene was shown to encode a
glucosyl-1-phosphate transferase catalyzing the transfer of
glucose to a lipid carrier. Moreover, cpsE-like genes were found
in S. pneumoniae serotypes 9N, 13, 14, 15B, 15C, 18F, 18A and 19F
(60). cpsE mutants were constructed in the serotypes 9N, 13,
14 and 15B. All mutant strains lacked glucosyltransferase
activity (60). Moreover, in all these S. pneumoniae serotypes
the cpsE gene seemed to be responsible for the addition of
glucose to the lipid carrier. Based on these data we suggest
that in S. suis type 1 the cps1E gene may fulfil a similar
function. The structure of the S. suis type 1 capsule is
unknown, but it is composed of glucose, galactose, N-acetyl
glucosamine, N-acetyl galactosamine and sialic acid in a ratio
of 1:2.4:1:1:1.4 (5). Therefore a role of a cpsE-like
glucosyltransferase activity can easily be envisaged.
cpsE-like sequences were also found in serotypes 2, 1/2 and 14.

For polysaccharide biosynthesis in S. pneumoniae type 14,
transfer of the second sugar of the repeating unit to the
first lipid-linked sugar is performed by the gene products of
cps14F and cps14G (20). Similar to Cps14F and Cps14G, the S.
suis type 1 proteins Cps1F and Cps1G may act as one
glycosyltransferase performing the same reaction. Cps14F and
Cps14G of S. pneumoniae showed similarity to the N-terminal
half and C-terminal half of the SpsK protein of Sphingomonas
(20, 67), respectively. This suggests a combined function for
both proteins. Moreover, cps14F- and cps14G-like sequences were
found in several serotypes of S. pneumoniae and these genes
always seemed to exist together (60). The same was observed
for S. suis type 1. The cps1F and cps1G probes hybridized
with type 1 and type 14 strains.

According to the similarity found between the cps1H gene and
the cps14H gene of S. pneumoniae (20), cps1H is expected to
encode a polysaccharide polymerase.

The protein encoded by the cps1I gene showed some
similarity with the Cps14J protein of S. pneumoniae (19). The
cps14J gene was shown to encode a β-1,4-galactosyltransferase
activity, responsible for the addition of the fourth (i.e.
last) sugar in the synthesis of the S. pneumoniae serotype 14
polysaccharide. In S. suis type 2 the proteins encoded by the
cps2J and cps2K genes showed similarity to the Cps14J protein.
However, no significant homologies were found between Cps2J,
Cps2K and Cps1I. In the N-terminal regions of Cps14J and
Cps14I two small conserved regions, DXS and DXDD, were
identified (19). These regions seemed to be important for
catalytic activity (13). At the same positions in the sequence
Cps2I contained the regions DXS and DXED.

In the region between Cps1G and Cps1H three small Orfs were
identified. Since the Orfs were expressed in three different
reading frames, and did not contain potential start sites,
expression is not expected. However, the three potential gene
products encoded by this region showed some similarity with
three successive regions of the C-terminal part of the EpsK
protein of Streptococcus thermophilus (27% identity, 40). The
region related to the first 82 amino acids is lacking. The
EpsK protein was suggested to play a role in the export of the
exopolysaccharide by rendering the polymerized
exopolysaccharide more hydrophobic through a lipid
modification. These data could suggest that the sequences in
the region between Cps1G and Cps1H originated from an epsK-like
sequence. Hybridization experiments showed that this epsK-like
region is also present in other serotype 1 strains as well as
in serotype 14 strains (results not shown).
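
To illustrate what it means for short Orfs to lie in three different
reading frames, the sketch below lists the stop-free stretches of a
DNA sequence in each of the three forward frames. It is a simplified
illustration only: it uses the standard stop codons TAA, TAG and TGA,
ignores the reverse strand, and does not look for start codons or
ribosome-binding sites. The function name open_stretches, the
min_codons parameter and the toy sequence are hypothetical; the actual
Orf analysis of the 718 bp region is not specified in this passage.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def open_stretches(dna: str, min_codons: int = 1):
    """List stop-free stretches in the three forward reading frames.

    Returns (frame, start, end) tuples with 0-based start and exclusive
    end, where frame is 0, 1 or 2. Stop codons are not included in the
    stretch. Only stretches of at least min_codons codons are reported.
    """
    dna = dna.upper()
    found = []
    for frame in range(3):
        start = frame
        for pos in range(frame, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    found.append((frame, start, pos))
                start = pos + 3
        # Stretch running to the last complete codon of this frame.
        end = frame + ((len(dna) - frame) // 3) * 3
        if (end - start) // 3 >= min_codons:
            found.append((frame, start, end))
    return found


if __name__ == "__main__":
    # Toy sequence for illustration only.
    for frame, start, end in open_stretches("ATGAAATAGCCC", min_codons=2):
        print(frame, start, end)
```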

The function of most of the cloned serotype 9 genes can be
established. Based on sequence similarity data the cps9E and
cps9F genes could encode glycosyltransferases (61, 24, 63, 64,
65). Moreover, the cps9G and cps9H genes showed similarity to
genes located in regions involved in polysaccharide
biosynthesis, but the function of these genes is unknown (68).
Cross-hybridization experiments using the individual cps2,
cps1 and cps9 genes as probes showed that the cps9G and cps9H
probes specifically hybridized with serotype 9 strains.
Therefore, these are useful as tools for the identification of
S. suis type 9 strains, both for diagnostic purposes and
in epidemiological and transmission studies. We previously
developed a PCR method which can be used to detect S. suis
strains in nasal and tonsil swabs of pigs (62). The method was
for example used to identify pathogenic (EF-positive) strains
of S. suis serotype 2. During the last years, besides S. suis
type 2 strains, serotype 9 strains are frequently isolated
from organs of diseased pigs. However, until now a rapid and
sensitive diagnostic test was not available for type 9
strains. Therefore, the type 9 specific probes or the type 9
specific PCR is of great diagnostic value. The cps1F, cps1G
and cps1I probes hybridized with serotype 1 as well as with
serotype 14 strains. In coagglutination tests type 1 strains
react with the anti-type 1 as well as with the anti-type 14
antisera (56). This suggests the presence of common epitopes
between these serotypes. On the other hand type 1 strains
agglutinated only with anti-type 1 serum (56, 57), indicating
that it is possible to detect differences between those
serotypes.

The cps2F, cps2G, cps2H, cps2I and cps2J probes hybridized
with serotypes 2 and 1/2 only. Serotype 34 showed a weak
hybridizing signal with the cps2G probe. As shown in
agglutination tests type 1/2 strains react with sera directed
against type 1 as well as with sera directed against type 2
strains (46). Therefore, type 1/2 shared antigens with both
types 1 and 2. Based on the hybridization patterns of serotype
1/2 strains with the cps1 and cps2 specific genes, serotype
1/2 seemed to be more closely related to type 2 strains than
to type 1 strains. In our current studies we identify
type-specific genes, primers or probes which are used for the
discrimination of serotypes 1, 14 and 2 and 1/2 and others of
the 35 serotypes yet known. Furthermore, type-specific genes,
primers or probes can now easily be developed for yet unknown
serotypes, once they become isolated.

TABLE 1. Bacterial strains and plasmids

strain/plasmid    | relevant characteristics | source/reference

Strain
E. coli
CC118             | PhoA- | (28)
XL2 blue          | | Stratagene

S. suis
10                | virulent serotype 2 strain | (49)
3                 | serotype 2 | (63)
17                | serotype 2 | (63)
735               | reference strain serotype 2 | (63)
T15               | serotype 2 | (63)
6555              | reference strain serotype 1 | (63)
6388              | serotype 1 | (63)
6290              | serotype 1 | (63)
5637              | serotype 1 | (63)
5673              | serotype 1/2 | (63)
5679              | serotype 1/2 | (63)
5928              | serotype 1/2 | (63)
5934              | serotype 1/2 | (63)
5209              | reference strain serotype 1/2 | (63)
5218              | reference strain serotype 9 | (63)
5973              | serotype 9 | (63)
6437              | serotype 9 | (63)
6207              | serotype 9 | (63)
reference strains | serotypes 1-34 | (9, 56, 14)
10                | virulent serotype 2 strain | (51)
10cpsB            | isogenic cpsB mutant of strain 10 | this work
10cpsEF           | isogenic cpsEF mutant of strain 10 | this work

Plasmid
pKUN19            | replication functions pUC, AmpR | (23)
pGEM7Zf(+)        | replication functions pUC, AmpR | Promega Corp.
pIC19R            | replication functions pUC, AmpR | (29)
pIC20R            | replication functions pUC, AmpR | (29)
pIC-spc           | pIC19R containing SpcR gene of pDL282 | lab collection
pDL282            | replication functions of pBR322 and pVT736-1, AmpR, SpcR | (43)
pPHOS2            | pIC-spc containing the truncated phoA gene of pPHO7 as a PstI-BamHI fragment |
pPHO7             | contains truncated phoA gene |
pPHOS7            | pPHOS2 containing chromosomal S. suis DNA |
pCPS6             | pKUN19 containing 6 kb HindIII fragment of cps operon |
pCPS7             | pKUN19 containing 3.5 kb EcoRI-HindIII fragment of cps operon |
pCPS11            | pCPS7 in which 0.4 kb PstI-BamHI fragment of cpsB gene is replaced by SpcR gene of pIC-spc |
pCPS17            | pKUN19 containing 3.1 kb KpnI fragment of cps operon |
pCPS18            | pKUN19 containing 1.8 kb SnaBI fragment of cps operon |
pCPS20            | pKUN19 containing 3.3 kb XbaI-HindIII fragment of cps operon |
pCPS23            | pGEM7Zf(+) containing 1.5 kb MluI fragment of cps operon |
pCPS25            | pIC20R containing 2.5 kb KpnI-SalI fragment of pCPS17 |
pCPS26            | pKUN19 containing 3.0 kb HindIII fragment of cps operon |
pCPS27            | pCPS25 containing 2.3 kb XbaI (blunt)-ClaI fragment of pCPS20 |
pCPS28            | pCPS27 containing the 1.2 kb PstI-XhoI SpcR gene of pIC-spc | this work (Fig. 1)
pCPS29            | pKUN19 containing 2.2 kb SacI-PstI fragment of cps operon | this work (Fig. 1)
pCPS1-1           | pKUN19 containing 5 kb EcoRV fragment of cps operon of type 1 | this work (Fig. 1)
pCPS1-2           | pKUN19 containing 2.2 kb HindIII fragment of cps operon of type 1 | this work (Fig. 1)
pCPS9-1           | pKUN19 containing 1 kb HindIII-XbaI fragment of cps operon of serotype 9 | this work (Fig. 1)
pCPS9-2           | pKUN19 containing 4.0 kb XbaI-XbaI fragment of cps operon of serotype 9 | this work (Fig. 1)

AmpR: ampicillin resistant
SpcR: spectinomycin resistant
cps: capsular polysaccharide

LEGENDS TO FIGURES

Fig. 1.
Genetic organization of the cps2 gene cluster.
(A) The arrows represent potential Orfs. Gene designations are
indicated below the arrows.
(B) Physical map and genetic organization of the cps2 locus on
the chromosome of S. suis serotype 2.
Restriction sites are as follows: C, ClaI; E, EcoRI; H,
HindIII; KpnI; MluI; PstI; S, SnaBI; Sa, SacI; X, XbaI.
(C) The DNA fragments cloned in the various plasmids are
indicated.

Fig. 2.
Ethidium bromide stained agarose gel showing PCR products
obtained with chromosomal DNA of S. suis strains belonging to
the serotypes 1, 2, 1/2, 9 and 14 and cps2J, cps1I and cps9H
primer sets as described in Materials and Methods. (A) cps1I
primers, (B) cps2J primers and (C) cps9H primers. Lanes 1-3:
serotype 1 strains; lanes 4-6: serotype 2 strains; lanes 7-9:
serotype 1/2 strains; lanes 10-12: serotype 9 strains and lanes
13-15: serotype 14 strains.

(B) Ethidium bromide stained agarose gel showing PCR products
obtained with tonsillar swabs collected from pigs carrying
S. suis type 2, type 1 or type 9 strains and cps2J, cps1I and
cps9H primer sets as described in Materials and Methods.
Bacterial DNA suitable for PCR was prepared by using the
multiscreen methods as described previously (20). (A) cps1I
primers, (B) cps2J primers and (C) cps9H primers. Lanes 1-3:
PCR products obtained with tonsillar swabs collected from pigs
carrying S. suis type 1 strains; lanes 4-6: PCR products
obtained with tonsillar swabs collected from pigs carrying
S. suis type 2 strains; lanes 7-9: PCR products obtained with
tonsillar swabs collected from pigs carrying S. suis type 9
strains; lanes 10-12: PCR products obtained with chromosomal
DNA from serotype 9, 2 and 1 strains respectively; lane 13:
negative control, no DNA present.

Figure 3
CPS2 nucleotide sequences and corresponding amino acid
sequences from the open reading frames.

Figure 4
CPS1 nucleotide sequences and corresponding amino acid
sequences from the open reading frames.

Figure 5
CPS9 nucleotide sequences and corresponding amino acid
sequences from the open reading frames.

CLAIMS

1. An isolated or recombinant nucleic acid encoding a capsular
gene cluster of Streptococcus suis or a gene or gene fragment
derived thereof.

2. A nucleic acid according to claim 1 encoding a
Streptococcus suis serotype-specific central region,
preferably encoding at least one enzyme or fragment thereof
involved in polysaccharide biosynthesis.

3. A nucleic acid according to claim 1 or 2 hybridising to a
nucleic acid encoding a gene derived from a Streptococcus suis
serotype 1, 2 or 9 capsular gene cluster.

4. An isolated or recombinant nucleic acid encoding a capsular
gene cluster of Streptococcus suis serotype 2 or a gene or
gene fragment derived thereof, preferably as identified in
Figure 3.

5. An isolated or recombinant nucleic acid encoding a capsular
gene cluster of Streptococcus suis serotype 1 or a gene or
gene fragment derived thereof, preferably as identified in
Figure 4.

6. An isolated or recombinant nucleic acid encoding a capsular
gene cluster of Streptococcus suis serotype 9 or a gene or
gene fragment derived thereof, preferably as identified in
Figure 5.

7. A nucleic acid probe or primer derived from a nucleic acid
according to any one of claims 1 to 6 allowing species or
serotype specific detection of Streptococcus suis.

8. A probe or primer according to claim 7 provided with at
least one reporter molecule.

9. A diagnostic test comprising a probe or primer according to
claim 7 or 8.

10. A protein or fragment thereof encoded by a nucleic acid
according to any one of claims 1 to 6.

11. A protein or fragment according to claim 10 capable of
polysaccharide biosynthesis.

12. A method to produce a Streptococcus suis capsular antigen
comprising using a protein or fragment according to claim 11.

13. A Streptococcus suis capsular antigen obtainable by a
method according to claim 12.

14. A vaccine comprising an antigen according to claim 13 and
further comprising a suitable carrier or adjuvant.

15. A recombinant Streptococcus suis mutant provided with a
modified capsular gene cluster.

16. A recombinant micro-organism comprising at least a part of
a capsular gene cluster of Streptococcus suis.

17. A recombinant micro-organism according to claim 16
comprising a lactic acid bacterium.

18. A vaccine comprising a mutant according to claim 15 or a
micro-organism according to claim 16 or 17.

ABSTRACT

The invention relates to Streptococcus suis infections of pigs,
to vaccines directed against those infections and to tests for
diagnosing Streptococcus suis infections.

The invention provides an isolated or recombinant nucleic acid
encoding a capsular gene cluster of Streptococcus suis or a
gene or gene fragment derived thereof. The invention
furthermore provides a nucleic acid probe or primer allowing
species or serotype specific detection of Streptococcus suis.

The invention also provides a Streptococcus suis antigen and
vaccine derived thereof.
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CPS*. 



AAGCTTGGAT ATTGATCACA TGATGGAGGT GATGGAAGCA TCTAAGTCTG CAGCGGGGTC GGCGTGCCCA 
AGTCCGCAGG CTTATCAGGC AGCTTTTGAG GGAGCTGAGA 

ACATTATCGT TGTGACGATT ACAGGTGGGC TATCGGGTAG TTTTAATGCG GCACGTGTAG CTAGGGATAT 
GTATATCGAA GAGCATCCGA ATGTCAATAT CCATTTGATA 

GATAGTTTGT CAGCCAGTGG GGAAATGGAT TTACTTGTAC ACCAAATCAA TCGCTTAATT AGTGCAGGAT 
TAGATTTTCC ACAAGTAGTA GAAGCGATAA CTCACTATCG 

GGAACACAGT AAGCTCCTCT TTGTTTTAGC GAAAGTTGAT AATCTTGTTA AGAATGGAAG ACTGAGCAAA 
TTGGTAGGCA CTGTCGTTGG TCTTCTCAAT ATCCGTATGG 

TTGGTGAGGC AAGTGCTGAA GGAAAATTAG AGTTGCTTCA AAAGGCGCGT GGTCATAAGA AATCTGTGAC 
AGCAGCCTTT GAAGAAATGA AAAAAGCAGG CTATGATGGT 

GGTCGAATTG TTATGGCCCA CCGCAACAAT GCTAAGTTCT TCCAACAATT CTCAGAGTTG GTAAAAGCAA 
GTTTTCCAAC GGCTGTTATT GACGAAGTTG CAACATCAGG 

TCTATGCAGT TTTTATGCTG AAGAAGGTGG ACTTTTGATG GGCTACGAAG TGAAAGCGTG ATTCACAGAG 
TAATAATTTT GGGCTGTAAT TTCCGCTATA GAATAATCCC 

CCTCTTCTTC TAAGTTCGAG GGGGATTGTT TGTATGAGAC TATTGGATTT CATTCATTCA AATATCTTAC 
GAATTGCTCC AGTTTATCTG CAAAATCTTG TTCAAAGAAG 

ATCTGTAAGA AATCAGCTTT CTGTCCGCTG AAATAATAAC ATTTTCCAAA CATGTGTTGG ATGCTAGGAG 
AAAGAATCCC CTTGCTTAGC TGAAAGGTCA CGCTCCCCTT 

TGGAATTCGA TACGGGATGT TTAAAGCGTA TTTCTCTAGA~<:^GTCTTTTA TTTTATTCCA TTGAGCGTGA 
TAAATGTGAT GAAGATGCTG TGTGTTCCGC GCAAACATAC 

CGTTATCAAT GTAGAGCGAG AGAGCTTTTT GCATGATAAG ATTGGTATCG TAGTCGATTA GACTCTTATG 
TTTGATGAAG ATATCACGTA GCTGATTAGG AAGGCTGATT 

GCACCGATTC GGAGGGCAGG AAAGAGTGTC GGTGTAAAAG ATTTTATATA GATGACGCGA TTATCTGTAT 
CAAGATAGTG TAAAGGTAGG CTATGACTAG AGTCGAAATC 

TGCTAAATAG TCATCCTCAA TGATGTAGAC ATCGTATTGC TTTGCTAATT TTACGATGGC TGTTTTTGTT 
GCTATATCAT AGGTTGAACC GAGAGGGTTG TGCAAGCGAG 

GAATTGTGTA GAAAAACTTA ATTTTTCCAG TTTGGAAGAT ACTTTCCAAT TCTTCTAGGT CAATTCCATC 
TAAATTCCGT TCAATTGTTT GATAGGGGAT TCCTTGATGT 

CGAATGAGCT CTATCATTCG TGAATAGGTA GGGTTCTCTA TCAAGATTTC CGTTTTTCCA GCCAAGGTTT 
CCATTTGTGT GAGAATATAT AGAGCTTGTT GACTACCAGC 

TGTGATAACC AGCTGGTCTT TTTTTGTATA GACATGATAG TCCATTAACA GACTTTGAAC GGAGGAAATC 
AATTCTGCCA ATCCCTCTTG CTGGTGATAG TAGTTGAATA 

GGTAATTTTC CCGCCCAATA AGACTTTCTT TTAGACAAAT CCGAAAATCT TCATAGGTAA TTCTTGAAAG 
TCTGTAGGAT TGAGCTCTAC AGGTATGGTC TTGGAAATCT 

CTATCCTCTA AGATATAATA ACCGCTTTTT TCGACAGCGT AGATCTTATT TTGGTATTTT AATTCCAACA 
TAGCCTTTTG GACAGTGTCT TTGCTACAAT GATATTGCTC 

GCGGAGTTGA CGGATAGAAG GTAATTTCTC TCCACGTTTG AATCGATGTT CCTCTATTCC AGTCAAAATA 
TCTTGGATGA TAACTTGATA TTTTTTCATC TAGGTCCCCT 

TTTTTATAGA CTATGTTACT AGCTAGTATA TAGAAAATU^T TGAAGAAAGA CAATATATGA ATAATGGGGT 
TGAGGTTCAG GAATTAAGCT ACTCTATGGT ATAATTAAGT 

GATGAAAATA ATTATACCTA ATGCAAAAGA AGTAAATACA AATCTAGAGA ATGCCTCGTT TTATCTCCTG 
TCTGATCGAA GCAAGCCGGT GCTGGATGCC ATAAGTCAAT 

TTGATGTAAA AAAGATGGCT GCCTTTTATA AATTGAATGA AGCAAAGGCT GAGTTAGAAG CTGACCGTTG 
GTATCGAATC AGGACAGGTC AAGCAAAAAC CTATCCAGCC 

TGGCAGTTAT ATGATGGTCT CATGTATCGT TATATGGATA GGCGAGGTAT AGATTCGAAA GAAGAAAATT 
ATTTACGTGA CCACGTTCGT GTAGCGACAG CCTTATACGG 

ATTGATTCAT CCTTTTGAAT TCATTTCACC TCACCGCTTA GATTTTCAAG GGAGCTTAAA GATAGGCAAT 
CAGTCTTTGA AACAGTACTG GCGACCGTAT TATGACCAAG 

AAGTTGGTGA TGATGAACTG ATTCTCTCAC TGGCTTCGTC AGAATTTGAG CAGGTGTTTT CTCCNCAGAT 
TCAGAAAAGA TTAGTTAAAA TTCTTTTCAT GGAAGAAAAA 

GCAGGTCAGC TAAAAGTTCA CTCGACTATA TCAAAAAAAG GCAGAGGAAG ATTGCTGTCC TGGTTGGCTA 
AGAACAATAT TCAGGAATTA TCGGACATTC AAGATTTTAA 

GGTGGATGGC TTTGAATATT GTACTTCCGA ATCAACGGCA AACCAACTTA CCTTCNTACG ATCAATAAAA 
ATGTGAAATT ATGAAA/^GA TAACGTTTTC CAGCGCTAAA 

AAGGGTAGAA AAATATTAAT TTCTATGATA TAATGGATGC GTTATAGGTA AAAGTCTAGG AAGGTTGTTT 
ATGAAAAAGA GAAGCGGACG AAGTAAGTCG TCCAAGTTCA 

AATTGGTAAA TTTTGCGCTT TTGGGACTTT ATTCCATTAC TCTATGTTTG TTCTTAGTGA CCATGTATCG 
CTATAACATC CTAGATTTCC GGTATTTAAA CTATATTGTG 

ACGCTTTTGC TAGTAGGAGT GGCAGTATTG GCTGGATTAT TGATGTGGCG TAAGAAAGCG CGCATATTTA 
CAGCGCTCTT ACTTGTTTTT TCACTGGTCA TCACGTCTGT 

TGGGATCTAT GGAATGCAAG AAGTTGTAAA ATTTTCAACA CGACTAAATT CAAATTCGAC ATTTTCAGAA 
TATGAAATGA GTATCCTTGT CCCAGCAAAT AGTGATATTA

CGGACGTTCG TCAGCTTACT AGTATCCTTG CTCCAGCCGA ATACGACCAA GATAACATCA CCGCTTTATT 
GGATGACATA TCCAAAATGG AATCTACTCA ACTAGCAACT 

AGCCCCGGGA CTTCTTACCT GACAGCATAT CAATCTATGT TGAATGGCGA GAGTCAAGCG ATGGTGTTCA 
ACGGAGTTTT TACCAATATT TTAGAAAATG AAGATCCAGG 

CTTTTCTTCA AAAGTGAAAA AAATATATAG TTTCAAAGTG ACTCAGACTG TTGAAACAGC TACTAAGCAG 
GTGAGTGGAG ATAGCTTTAA TATCTATATT AGTGGTATTG 

ATGCTTATGG ACCGATTTCT ACGGTCTCTC GTTCAGATGT CAATATCATT ATGACTGTCA ATCGTGCGAC 
ACATAAGATT TTATTGACAA CTACTCCACG AGATTCATAC 

GTTGCTTTCG CAGATGGCGG GCAAAATCAA TACGATAAAC TAACACATGC TGGTATTTAC GGTGTCAATG 
CTTCTGTGCA CACCTTAGAA AATTTTTATG GGATTGACAT 

TAGCAATTAT GTGCGGTTGA ACTTCATTTC CTTCCTTCAA TTAATCGACT TGGTGGGTGG AATTGATGTA 
TATAACGATC AAGAATTTAC AAGTTTACAT GGGAATTATC 

ATTTCCCTGT TGGACAAGTT CATTTAAACT CAGACCAAGC ATTAGGCTTC GTTCGAGAGC GCTACTCTTT 
AACAGGGGGT GACAATGACC GTGGTAAAAA CCAGGAAAAA 

GTGATTGCTG CCTTGATTAA AAAGATGAGT ACGCCAGAGA ATCTAAAAAA TTACCAGGCA ATCCTATCTG 
GATTGGAAGG CTCAATTCAA ACGGATTTGA GCTTAGAAAC 

GATTATGAGT TTAGTGAATA CCCAACTAGA ATCAGGAACA CAATTTACAG TAGAGTCACA AGCATTGACA 
GGAACAGGAC GCTCAGACTT ATCTTCTTAT GCGATGCCTG

GATCACAACT TTATATGATG GAAATTAACC AAGATAGTCT GGAGCAATCA AAGGCAGCGA TTCAGTCCGT 
ACTTGTTGAA AAATAAAGAT TTTAGGAGAA AATATGAACA 

ATCAAGAAGT AAATGCAATC GAAATCGATG TTTTATTCTT ACTAAAAACA ATTTGGAGAA AGAAATTTTT 
AATTCTCTTA ACTGCAGTGT TGACTGCGGG GTTGGCATTT 

GTCTACAGTA GTTTTTTAGT GACACCTCAA TATGACTCCA CTACCCGTAT CTATGTAGTG AGTCAAAATG 
TTGAAGCCGG TGCGGGCTTG ACTAACCAAG AGTTACAAGC 

GGGTACCTAT TTGGCAAAAG ACTATCGGGA AATTATCCTA TCACAAGATG TNTTGACACA AGTAGCAACG 
GAATTGAATC TGAAAGAGAG TTTGAAAGAA AAAATATCAG 

TTTCTATTCC TGTTGATACT CGTATCGTTT CTATTTCTGT GCGTGATGCG GATCCAAATG AAGCGGCACG 
TATTGCAAAT AGCCTTCGCA CCTTTGCAGT GCAAAAGGTT 

GTTGAGGTCA CCAAGGTAAG CGATGTGACG ACACTTGAAG AAGCAGTCCC AGCGGAAGAA CCAACCACTC 
CAAATACAAA ACGAAATATC TTGCTTGGTT TATTAGCTGG 

AGGTATCTTG GCAACAGGTC TTGTACTGGT TATGGAGGTT TTGGATGACC GTGTAAAACG TCCTCAGGAC 
ATCGAAGAGG TAATGGGATT GACATTGCTA GGTATAGTAC 

CAGATTCGAA GAAATTAAAA TAGGAGAACA ATATGGCGAT GTTAGAAATT GCACGTACAA AAAGAGAGGG 
AGTAAATAAA ACCGAGGAGT ATTTCAATGC TATCCGTACC 

AATATTCAGC TTAGCGGAGC AGATATTAAG GTTGTTGGTA TTACCTCTGT TAAATCGAAT GAAGGTAAGA 
GTACAACTGC GGCTAGTCTC GCTATTGCCT ATGCTCGTTC 

AGGTTATAAG ACCGTCTTGG TGGATGCAGA TATCCGAT^T TCAGTCATGC CTGGTTTCTT CAAGCCAATT 
ACAAAGATTA CAGGTTTGAC GGATTACCTA GCAGGGACAA 

CAGACTTGTC TCAAGGATTA TGCGATACAG ATATTCCAAA CTTGACCGTA ATTGAGTCAG GAAAGGTTTC 
TCCCAACCCT ACTGCCCTTT TACAAAGTAA GAATTTTGAA 

AATCTACTTG CGACTCTTCG TCGCTATTAT GATTATGTTA TCGTTGACTG TCCACCATTA GGACTGGTAA 
TTGATGCAGC TATCATTGCA CAAAAATGTG ATGCGATGGT 

TGCAGTAGTA GAAGCAGGCA ATGTTAAGTG CTCATCTTTG AAAAAAGTAA AAGAGCAGTT GGAACAAACA 
GGCACACCGT TCTTAGGCGT TATCTTGAAC AAATATGATA 

TTGCCACTGA GAAGTATAGT GAATACGGAA ATTACGGCAA AAAAGCCTAA TTTCTCAGAT AACATAAGTT 
TGATAAGTAG GTATTAATAT GATTGATATC CATTCGCATA 

TCATATTTGG TGTGGATGAC GGTCCCAAAA CTATTGAAGA GAGCCTGAGT TTGATAAGCG AAGCTTATCG 
TCAAGGTGTT CGCTATATCG TAGCGACATC TCATAGACGA 

AAAGGGATGT TTGAAACAGC AGAAAAAATC ATCATGATTA ACTTTCTTCA ACTTAAAGAG GCAGTAGCAG 
AAGTTTATCC TGAAATACGA TTGTGCTATG GTGCTGAATT 

GTATTATAGT AAAGATATCT TAAGCAAACT TGAAAAAAAG AAAGTACCAA CACTTAATGG CTCGTGCTAT 
ATTCTCTTGG AGTTCAGTAC GGATACTCCT TGGAAAGAGA 

TTCAAGAAGC AGTGAACGAA ATGACGCTAC TTGGGCTAAC TCCCGTACTT GCCCATATAG AGCGTTATGA 
TGCTCTGGCA TTTCAGTCAG AGAGAGTAGA AAAGCTAATT 

GACAAGGGAT GCTACACTCA GGTAAATAGT AACCATGTGT TGAAGCCTGC TTTAATTGGC GAACGAGCAA 
AAGAATTTAA AAAACGTACT CGATATTTTT TAGAGCAGGA 

TTTAGTACAT TGTGTTGCTA GCGATATGCA TAATTTATAT AGTAGACCTC CGTTTATGAG GGAGGCGTAT 
CAGCTTGTAA AAAAAGAGTA TGGTGAGGAT AGAGCGAAGG 

CTTTGTTCAA GAAAAATCCT TTGTTGATAT TGAAAAATCA AGTACAGTAA CCTCATAGAA ATAGTGGAGG 
AGCTATGAAT ATTGAAATAG GATATCGCCA AACGAAATTG 
GCATTGTTTG ATATGATAGC AGTTACGATT TCTGCAATCT TAACAAGTCA TATACCAAAT GCTGATTTAA
ATCGTTCTGG AATTTTTATC ATAATGATGG TTCATTATTT 

TGCATTTTTT ATATCTCGTA TGCCGGTTGA ATTTGAGTAT AGAGGTAATC TGATAGAGTT TGAAAAAACA 
TTTAACTATA GTATAATATT TGTAATTTTT CTTATGGCAG 

TTTCATTTAT GTTAGAGAAT AATTTCGCAC TTTCAAGACG TGGTGCCGTG TATTTCACAT TAATAAACTT 
CGTTTTGGTA TACCTATTTA ACGTAATTAT TAAGCAGTTT 

AAGGATAGCT TTCTATTTTC GACAACCTAT CAAAAAAAGA CGATTCTAAT TACAACGGCT GAACTATGGG 
AAAATATGCA AGTTTTATTT GAATCAGATA TACTATTTCA 

AAAAAATCTT GTTGCATTGG TAATTTTAGG TACAGAAATA GATAAAATTA ATTTACCATT ACCGCTCTAT 
TATTCTGTTG AAGAAGCTAT AGGGTTTTCA ACAAGGGAAG 

TGGTCGACTA CGTCTTTATA AATTTACCAA GTGAATATTT TGACTTAAAG CAATTAGTTT CAGACTTTGA 
GTTGTTAGGT ATTGATGTAG GCGTTGATAT TAATTCATTC 

GGTTTTACTG TGTTGAAGAA TAAAAAAATC CAAATGCTAG GTGACCATAG CATCGTCACT TTTTCCACAA 
ATTTTTATAA GCCTAGTCAC ATCTGGATGA AACGACTTTT 

AGATATACTT GGAGCAGTAG TCGGGTTAAT TATTAGTGGT ATAGTTTCTA TTTTGTTAAT TCCAATTATT 
CGTAGAGATG GTGGGCCAGC CATTTTTGCT CAGAAACGAG 

TTGGACAGAA TGGACGCATA TTTACATTCT ACAAGTTTCG TTCGATGTTT GTTGATGCCG AGGTACGTT^ 
GAAAGAATTA ATGGCTCAAA ACCAGATGCA AGGTGGGATG 

TTCAAAATGG ACAACGATCC TAGAATTACT CC/^TTGGAC-fifcTTCATACG AAAAACAAGT TTAGATGAGT 
TACCACAATT TTATAATGTT CTAATTGGAG ATATGAGTCT 

AGTCGGTACC CGTCCGCCTA CAGTTGATGA ATTTGAAAAA TATACTCCTA GTCAAAAGAG AAGATTGAGT 
TTTAAACCAG GGATTACAGG TCTTTGGCAA GTGAGCGGAA 

GAAGTGATAT CACAGATTTT AATGAAGTCG TTAGGCTGGA CCTAACATAC ATTGATAATT GGACCATCTG 
GTCAGACATT AAGATTTTAT TGAAGACAGT GAAAGTTGTA 

TTGTTGAGAG AGGGAGGTCA GTAAGACTCC TTTAAAACAA AGAATAGTAG TAGGGGATAT GAGAACAGTT 
TATATTATTG GTTCAAAAGG AATACCAGCA AAGTATGGTG 

GTTTCGAGAC TTTCGTAGAA AAATTAACTG AGTATCAGAA AGATAAATCA ATTAATTATT TTGTTGCATG 
TACAAGAGAA AATTCAGCAA AATCAGATAT TACAGGAGAA 

GTTTTTGAAC ATAATGGAGC AACATGTTTT AATATTGATG TGCCAAATAT TGGTTCAGCA AAAGCCATTC 
TTTATGATAT TATGGCTCTC AAGAAATCTA TTGAAATTGC 

CAAAGATAGA AATGATACCT CTCCAATTTT CTACATTCTT GCTTGTCGGA TTGGTCCTTT CATTTATCTT 
TTTAAGAAGC AGATTGAATC AATTGGAGGT CAACTTTTCG 

TAAACCCAGA CGGTCATGAA TGGCTACGTG AAAAGTGGAG TTATCCCGTC CGACAGTATT GGAAATTTTC 
TGAGAGTTTG ATGTTAAAAT ACGCTGATTT ACTAATTTGT 

GATAGCAAAA ATATTGAAAA ATATATTCAT GAAGATTATC GAAAATATGC TCCTGAAACA TCTTATATTG 
CTTATGGAAC AGACTTAGAT AAATCACGCC TTTCTCCGAC 

AGATAGTGTA GTACGTGAGT GGTATAAGGA GAAGGAAATT TCAGAAAATG ATTACTATTT GGTTGTTGGA 
CGATTTGTGC CTGAAAATAA CTATGAAGTA ATGATTCGAG 

AGTTTATGAA ATCATATTCA AGAAAAGATT TTGTTTTGAT AACGAATGTA GAGCATAATT CCTTTTATGA 
GAAATTGAAA AAAGAAACAG GGTTCGATAA AGATAAGCGT 

ATAAAGTTTG TTGGAACAGT CTATAATCAG GAGCTGTTAA AATATATTCG TGAAAATGCA TTTGCTTATT 
TTCATGGTCA CGAGGTTGGA GGAACGAACC CATCTTTACT 

TGAAGCACTT TCTTCTACTA AACTAAATCT TCTTCTAGAT GTGGGCTTTA ATAGAGAAGT AGGGGAAGAA 
GGAGCGAAAT ACTGGAATAA AGATAATCTT CACAGAGTTA 

TTGACAGTTG TGAGCAATTA TCACAAGAAC AAATTAATGA TATGGATAGT TTATCAACAA AACAAGTCAA 
AGAAAGATTT TCTTGGGATT TTATTGTTGA TGAGTATGAG 

AAGTTGTTTA AAGGATAAGT TATGAAAAAG ATTCTATATC TCCATGCTGG AGCAGAATTA TATGGGGCAG 
ATAAGGTTCT CTTGGAACTT ATAAAAGGCT TAGATAAGAA 

TGAATTTGAA GCGCATGTTA TCCTACCTAA TGATGGAGTC CTAGTGCCAG CATTAAGAGA AGTTGGTGCG 
CAAGTTGAAG TTATTAACTA TCCAATTCTA CGTAGGAAAT 

ATTTTAATCC AAAAGGGATT TTTGACTACT TCATATCATA TCATCACTAT TCTAAACAGA TTGCTCAATA 
TGCCATAGAA AATAAGGTTG ACATAATTCA CAATAATACT 

ACCGCTGTCT TAGAAGGCAT TTATCTGAAG CGAAAACTCA AATTACCTTT GTTGTGGCAT GTTCATGAGA 
TTATTGTCAA ACCTAAATTC ATCTCTGATT CGATCAATTT 

TTTAATGGGG CGTTTTGCTG ATAAGATTGT GACAGTTTCA CAGGCTGTGG CAAACCATAT AAAACAATCA 
CCTCATATCA AAGATGACCA AATCAGTGTA ATCTACAATG 

GGGTAGATAA TT^AAGTGTTT TATCAGTCCG ATGCTCGGTC TGTTCGAGAA AGATTTGACA TTGACGAAGA 
GGCTCTTGTC ATTGGTATGG TCGGTCGAGT CAATGCGTGG 

AAAGGACAAG GAGATTTTTT AGAAGCAGTT GCTCCTATAC TCGAACAGAA TCCAAAAGCT ATCGCCTTTA 
TAGCAGGAAG TGCTTTTGAft GGAGAAGAGT GGCGAGTAGT 

AGAATTAGAA AAGAAGATTT CTCAATTAAA GGTCTCTTCT CAAGTCAGAC GAATGGATTA TTATGCAAAT 
ACCACTGAAT TATATAATAT GTTTGATATT TTTGTACTTC

CAAGTACTAA TCCAGACCCT CTACCAACGG TTGTACTAAA AGCAATGGCA TGCGGTAAAC CTGTTGTCGG 
TTACCGACAT GGTGGTGTTT GTGAGATGGT GAAAGAAGGT 

GTTAACGGTT TCTTAGTCAC TCCGAACTCA CCGTTAAATT TATCAAAAGT AATTCTTCAG TTATCGGAAA 
ATATAAATCT CAGAAAAAAA ATTGGTAATA ATTCTATAGA 

ACGTCAAAAA GAACATTTTT CGTTAAAAAG CTATGTAAAA AATTTTTCGA AAGTCTACAC CTCCCTCAAA 
GTATACTGAT TGGCTGAAGT GAATGCTTTA GTATAGCGAT 

TTATCGTATT CTCATTCGAT AAAACAAATG TTCAGAAACA GTTATAAGTT ATTTCTAAAG GGCACCTCTA 
TAAACTCCCA AAATTGCGAA TTTGGAGTTA CGAAAGCCTT 

GTTAAATCAA CATTTTAAAT TTTAGAAAAT TAGTTTTTAG AGCTCCCCTA AAATAGAAGA TAACAGAAGG 
GAGCCTTCAA AAACTTCATT TTTAATTGGA TTGTAGAAAA 

ACTGTTAAAT CAATATTTAG ATTTTTAGGA GTTCAGTTTT TGGGGGGAGA GCTTAATAAT CTATGCACTA 
TATTTCGAAA AATATATGGT GTAAAATCAG AACTGATGGT

CGTGGCAAAA AAGAGAATGA GGAATTTATG AAAATTATTT CTTTTACAAT GGTTAATAAC GAAAGTGAGA 
TAATAGAGTC ATTTATACGG TATAATTATA ACTTTATTGA 

CGAGATGGTC ATTATTGATA ATGGTTGTAC AGATAACACG ATGCAAATTA TTTTTAATTT GATTAAAGAG 
GGATATAAAA TATCCGTATA TGATGAGTCT TTAGAGGCAT 

ATAATCAGTA TCGACTTGAT AATAAATATC TAACGAAAAT AATTGCTGAA AAAAATCCAG ATTTGATAAT 
ACCTTTGGAT GCGGATGAAT TTTTAACAGC CGATTCAAAT

CCACGGAAAC TTTTGGAACA ACTGGACTTA GAAAAGATAC ATTATGTGAA TTGGCAATGG TTTGTTATGA 
CTAAAAAAGA TGATATTAAT GATTCGTTTA TACCACGTAG 

AATGCAATAT TGTTTTGAAA AACCTGTTTG GCATCATTCT GATGGTAAAC CAGTTACTAA ATGTATAATT 
TCCGCTAAGT ATTACAAAAA AATGAATTTA AAGCTATCGA 

XGGGACATCA CACTGTTTTT GGTAACCCAA ATGTAAGGAT AGAACATCAT AATGATTTGA AATTTGCACA 
TTATCGAGCT ATTAGCCAAG AGCAATTAAT TTATAAAACA 

ATTTGTTACA CTATTCGCGA TATTGCTACT ATGGAGAACA ATATCGAAAC AGCTCAAAGA ACAAATCAGA 
TGGCGCTCAT TGAATCTGGC GTGGATATGT GGGAAACGGC 

GAGAGAAGCC TCTTATTCAG GTTATGATTG TAATGTTATA CATGCACCAA TTGATTTAAG TTTTTGTAAA 
GAAAATATTG TAATAAAATA TAACGAACTA TCCAGAGAAA

CAGTAGCAGA ACGCGTGATG AAAACGGGAA GAGAAATGGC TGTTCGTGCA TATAATGTGG AGCGAAAACA 
AAAAGAAAAG AAATTTCTAA AACCTATTAT ATTTGTATTA 

GATGGGTTAA AAGGAGATGA GTATATTCAT CCCAATCCAT CAAATCATTT GACGATCTTA ACTGAAATGT 
ATAACGTCAG AGGCTTACTT ACCGATAATC ACCAAATTAA 

ATTTCTCAAA GTTAATTATA GATTAATTAT AACTCCAGAT TTTGCTAAGT TTTTACCGCA TGAATTTATT 
GTTGTACCAG ATACCTNGGA TATAGAGCAA GTTAAAAGCC 

AGTATGTTGG TACAGGTGTA GACTTGTCAA AGATTATTTC TTTAAAAGAG TATCGAAAAG AGATAGGCTT 
TATTGGTAAT TTGTATGCGC TTTTAGGATT TGTTCCGAAT 

ATGCTCAATA GAATTTATCT ATATATTCAG AGAAACGGTA TTGCAAACAC TATTATAAAA ATCAAGTCGA 
GATTGTGAGA GTTGTTTACT TTTATTTGTA ATTTTAAAAG 

TAATGCAGGC AGATAGGAGA AAAACGTTTG GAAAAATGAG AATAAGAATT AATAATTTGT TTTTTGTTGC 
CATAGCGTTT ATGGGCATAA TTATTAGTAA TTCGCAAGTT 

GTTCTAGCGA TAGGCAAAGC TTCTGTGATT CAGTATCTAT CTTATTTAGT TTTGATTTTA TGTATAGTTA 
ATGATTTATT AAAAAATAAC AAACATATTG TAGTTTATAA 

ATTAGGGTAT TTGTTTCTTA TTATATTTTT ATTTACTATC GGAATATGTC AGCAAATTCT TCCTATAACA 
ACTAAAATAT ATTTATCAAT TTCAATGATG ATTATTTCAG 

TTTTAGCAAC GTTGCCAATA AGTTTGATAA AAGATATTGA TGATTTTAGA CGGATTTCAA ATCATTTGTT 
ATTCGCTCTT TTTATAACTT CGATATTAGG AATAAAGATG 

GGGGCAACGA TGTTCACGGG GGCAGTAGAA GGTATCGGTT TTAGTCAGGG TTTTAATGGA GGATTGACGC 
ATAAGAACTT TTTTGGAATA ACTATTTTAA TGGGGTTCGT 

ATTAACTTAC TTGGCGTATA AGTATGGTTC CTATAAAAGA ACGGATCGTT TTATTTTAGG ATTAGAATTG 
TTTTTGATTC TTATTTCAAA CACACGCTCA GTTTATTTAA 

TACTATTGCT TTTTCTATTT CTTGTTAATC TTGACAAAAT CAAAATAGAA CAAAGACAAT GGAGTACGCT 
TAAATATATT TCCATGCTAT TTTGTGCTAT TTTTTTATAC 

TATTTCTTTG GTTTTTTAAT AACACATAGT GATTCTTACG CTCATCGCGT TAATGGTCTT ATTAATTTTT 
TTGAGTATTA TAGAAATGAT TGGTTCCATC TAATGTTTGG 

TGCAGCGGAT TTGGCATATG GGGATTTAAC TTTAGACTAT GCTATAAGGG TTAGACGCGT TTTAGGTTGG 
AATGGAACGC TTGAAATGCC CTTACTGAGT ATTATGTTAA 

AAAATGGTTT TATCGGTCTG GTAGGGTATG GGATTGTTTT ATATAAACTT TATCGTAATG TAAGAATATT 
AAAAACAGAT AATATAAAAA CAATAGGAAA GTCTGTATTT 

ATCATTGTAG TCCTATCTGC AACAGTAGAA AATTATATTG TAAATTTAAG TTTTGTATTT ATGCCT^TAT 
GTTTTTGTTT ATTAAATTCT ATATCTACTA TGGAATCAAC 
TATTAACAAA CAACTGCAAA CATAAATTGG CAGGAATAGA GTTTTGAGTT GCTATTAATT TGGTAGAGCA
TATGTTCTAT AGGTGGCAAG ATAAAGATAG TATTTTTTAC 

ATGATGATTT TTATGATAGC AAAGCAAGTT ACGGCATAAA AGGAATTAGA GGATGGAAAA AGTCAGCATT 
ATTGTACCTA TTTTTAATAO GGAAAAGTAC TTAAGAGAGT 

GTTTAGATAG CATTATTTCC CAATCGTATA CTAATCTAGA GATTCTTTTG ATAGATGACG GTTCTTCAGA 
TTCATCAACG GATATATGTT TGGAATACGC AGAGCAAGAT 

GGTAGAATAA AACTTTTCCG GTTACCAAAT GGTGGTGTTT CAAACGCAAG GAATTACGGT ATCAAAAATA
GCACAGCAAA TTATATTATG TTTGTAGATT CTGATGATAT 

TGTTGACGGC AACATTGTTG AGTCCTTATA CACCTGTTTA AAAGAGAATG ATAGTGATTT GTCGGGAGGG 
TTACTTGCTA CTTTTGATGG AAATTATCAA GAATCTGAGC 

TGCAAAAGTG TCAAATTGAT TTGGAAGAGA TAAAAGAGGT GCGAGACTTA GGAAATGAAA ATTTTCCCAA 
TCATTATATG AGCGGTATCT TTAATAGCCC TTGTTGCAAA 

CTTTATAAGA ATATATATAT AAACCAAGGT TTTGACACTG AACAGTGGTT AGGAGAGGAC TTATTATTTA 
ATCTAAATTA TTTAAAGAAT ATAAAAAAAG TCCGCTATGT 

TAACAGAAAT CTTTATTTTG CCAGAAGAAG TTTACAAAGT ACTACAAATA CGTTTAAATA TGATGTTTTT 
ATTCAATTAG AAAATTTAGA AGAAAAAACT TTTGATTTGT 

TTGTTAAAAT ATTTGGTGGA CAATATGAAT TTTCTGTTTT TAAAGAGACG CTACAGTGGC ATATTATTTA 
TTATAGCTTA TTAATGTTCA AAAATGGAGA TGAATCGCTT 

CCAAAGAAAT TGCATATATT TAAGTATTTA TACAATAGGC^SlTTCTTTAGA TACTCT/^GT ATTAAACGAA 
CGTCCTCTGT TTTTAAAAGA ATATGTAAAT TAATTGTTGC 

TAATAATTTG TTTAAAATTT TTTTAAATAC TTTAATTAGG GAAGAAAAAA ATAATGATTA ACATTTCTAT 
CATCGTCCCA ATTTACAATG TTGAACAATA TCTATCCAAG 

TGTATAAATA GCATTGTAAA TCAGACCTAC AAACATATAG AGATTCTTCT GGTGAATGAC GGTAGTACGG 
ATAATTCGGA AGAAATTTGT TTAGCATATG CGAAGAAAGA 

TAGTCGCATT CGTTATTTTA AAAAAGAGAA CGGCGGGCTA TCAGATGCCC GTAATTATGG CATAAGTCGC 
GCCAAGGGTG ACTACTTAGC TTTTATAGAC TCAGATGATT 

TTATTCATTC GGAGTTCATC CAACGTTTAC ACGAAGCAAT TGAGAGAGAG AATGCCCTTG TGGCAGTTGC 
TGGTTATGAT AGGGTAGATG CTTCGGGGCA TTTCTTAACA 

GCAGAGCCGC TTCCTACAAA TCAGGCTGTT CTGAGCGGCA GGAATGTTTG TAAAAAGCTG CTAGAGGCGG 
ATGGTCATCG CTTTGTGGTG GCCTGGAATA AACTCTATAA 

AAAAGAACTA TTTGAAGATT TTCGATTTGA AAAGGGTAAG ATTCATGAAG ATGAATACTT CACTTATCGC 
TTGCTCTATG AGTTAGAAAA AGTTGCAATA GTTAAGGAGT 

GCTTGTACTA TTATGTTGAC CGAGAAAATA GTATCATAAC TTCTAGTATG ACTGACCATC GCTTCCATTG 
CCTACTGGAA TTTCAAAATG AACGAATGGA CTTCTATGAA 

AGTAGAGGAG ATAAAGAGCT CTTACTAGAG TGTTATCGTT CATTTTTAGC CTTTGCTGTT TTGTTTTTAG 
GCAAATATAA TCATTGGTTG AGCAAACAGC AAAAGAAGCT 

TCTCCAAACG CTATTTAGAA TTGTATATAA ACAATTGAAG CAAAATAAGC GACTTGCTTT ACTAATiSAAT 
GCTTATTATT TGGTAGGGTG TCTTCATCTT AATTTTAGTG 

TCTTTCTGAA AACGGGGAAA GATAAAATTC AAGAAAGATT GAGAAGAAG"^ GAAAGTAGTA CTCGGTAAGA 
ATGTTGTAAT AAATGGTTGA AAGAAAAGGG GATTAAAATG 

AATCCAACAA ATAGTAGAAT AGCACTCTTT GATACGATTA AATGTATCAT GGTACTTTGT GTTATTTTTA 
CACATCTGGA TTGGTCTGTT GAGCAGCGTC CATGGTTTAT 

CTTTCCGTAT TTCGTTGACA TGGCTGTTCC AATTTTCNGT TGCTTCTGCC TATTTTCN 
SLDIDHMMEVMEASKSAAGSACPSPQAYQAAFEGAENIIWTITGGLSGSFNAARVARDM
YIEEHPNVNIHLIDSLSASGEMDLLVHQINRLISAGLDFPQWEAITHYREHSKLLFVLA 
KVDNLVKNGRLSKLVGTWGLLNIRMVGEASAEGKLELLQKARGHKKSVTAAFEEMKKAG 
YDGGRIVMAHRNNAKFFQQFSELVKASFPTAVIDEVATSGLCSFYAEEGGLLMGYEVKA 
ORF Y



MKKYQVIIQDILTGIEEHRFKRGEKLPSIRQLREQYHCSKDTVQKAMLELKYQNKIYAVE 
KSGYYILEDRDFODHTCRAQSYRLSRITYEDFRICLKESLIGRENYLFNYYHQQEGLAEL 

ISSVQSLLMDYHVYTKKDQLVITAGSQQALYILTQMETLAGKTEILIENPTYSRMIELIR
HQGIPYQTIERNLDGIDLEELESIFQTGKIKFFYTIPRLHNPLGSTYDIATKTAIVKLAK
QYDVYIIEDDYLADFDSSHSLPLHYLDTDNRVIYIKSFTPTLFPALRIGAISLPNQLRDI
FIKHKSLIDYDTNLIMQKALSLYIDNGMFARNTQHLHHIYHAQWNKIKDCLEKYALNIPY

RIPKGSVTFQLSKGILSPSIQHMFGKCYYFSGQKADFLQIFFEQDFADKLEQFVRYLNE 
ORF X



MKIIIPNAKEVNTNLENASFYLLSDRSKPVLDAISQFDVKKMAAFYKLNEAKAELEADRW 
YRIRTGQAKTYPAWQLYDGLMYRYMDRRGIDSKEENYLRDHVRVATALYGLIHPFEFISP 
HRLDFQGSLKIGNQSLKQYWRPYYDQEVGDDELILSLASSEFEQVFSPQIQKRLVKILFM 
EEKAGQLKVHSTISKKGRGRLLSWLAKNNIQELSDIQDFKVDGFEYCTSESTANQLTFXR 
SIKM 
CPS2A



MKKRSGRSKSSKFKLVNFALLGLYSITLCLFLVTMYRYNILDFRYLNYIVTLLLVGVAVL 
AGLLMWRKKARIFTALLLVFSLVITSVGIYGMQEWKFSTRLNSNSTFSEYEMSILVPAN 
SDITDVRQLTSILAPAEYDQDNITALLDDISKMESTQLATSPGTSYLTAYQSMLNGESQA 
MVFNGVFTNILENEDPGFSSKVKKIYSFPCVTQTVETATKQVSGDSFNIYISGIDAYGPIS 
TVSRSDVNIIMTVNRATHKILLTTTPRDSYVAFADGGQNQYDKLTHAGIYGVNASVHTLE 
NFYGIDISNYVRLNFISFLQLIDLVGGIDVYNDQEFTSLHGNYHFPVGQVHLNSDQALGF 
VRERYSLTGGDNDRGKNQEKVIAALIKKMSTPENLKNYQAILSGLEGSIQTDLSLETIMS 
LVNTQLESGTQFTVESQALTGTGRSDLSSYAMPGSQLYMMEINQDSLEQSKAAIQSVLVE 
K 
CPS2B



MNNQEVNAIEIDVLFLLKTIWRKKFLILLTAVLTAGLAFVYSSFLVTPQYDSTTRIYWS 
QNVEAGAGLTNQELQAGTYLAKDYREIILSQDVLTQVATELNLKESLKEKISVSIPVDTR 
IVSISVRDADPNEAARIANSLRTFAVQKWEVTKVSDVTTLEEAVPAEEPTTPNTKRNIL 
LGLLAGGILATGLVLVMEVLDDRVKRPQDIEEVMGLTLLGIVPDSKKLK 



Fig. 3 cont.






CPS2C 



MAMLEIARTKREGVNKTEEYFNAIRTNIQLSGADIKWGITSVKSNEGKSTTAASLAIAY
ARSGYKTVLVDADIRNSVMPGFEKPITKITGLTDYLAGTTDLSQGLCDTDIPNLTVIESG
KVSPNPTALLQSKNFENLLATLRRYYDYVIVDCPPLGLVIDAAIIAQKCDAMVAWEAGN 
VKCSSLKKVKEQLEQTGTPFLGVILNKYDIATEKYSEYGNYGKKA 
CPS2D



MIDIHSHIIFGVDDGPKTIEESLSLISEAYRQGVRYIVATSHRRKGMFETPEKIIMINFL 
QLKEAVAEVYPEIRLCYGAELYYSKDILSKLEKKKVPTLNGSCYILLEFSTDTPWKEIQE 
AVNEMTLLGLTPVLAHIERYDAIAFQSERVEKLIDKGCYTQVNSNHVLKPALIGERAKEF 
BCKRTRYFLEQDLVHCVASDMHNLYSRPPFMREAYQLVKKEYGEDRAKALFKKNPLLILKN 
QVQ 
CPS2E



M^IEIGYRQTKLALFDMIAVTISAILTSHIPNADLNRSGIFIIMMVHYFAFFISRMPVEF 
EYRGNLIEFEKTFNYSIIFVIFLMAVSFMLENNFALSRRGAVYFTLINFVLVYLFNVIIK
QFKDSFLFSTTYQKKTILITTAELWENMQVLFESDILFQBCNLVALVILGTEIDKINLPLP 
LYYSVEEAIGFSTREWDYVFINLPSEYFDLKQLVSDFELLGIDVGVDINSFGFTVLKNK 
KIQMLGDHSIVTFSTNFYKPSHIWMKRLLDILGAWGLIISGIVSILLIPIIRRDGGPAI 
FAQKRVGQNGRIFTFYKFRSMFVDAEVRKKE3>4AQNQMQGGMF?MDNDPRITPIGHFIRK 
TSLDELPQFYNVLIGDMSLVGTRPPTVDEFEKYTPSQKRRLSFKPGITGLWQVSGRSDIT 
DFNEWRLDLTYIDNWTIWSDIKILLKTVKWLLREGGQ
CPS2F



MRTVYIIGSKGIPAKYGGFETFVEKLTEYQKDKSINYFVACTRENSAKSDITGEVFEHNG 
ATCFNIDVPNIGSAKAILYDIMALKKSIEIAKDRNDTSPIFYILACRIGPFIYLFKKQIE 
SIGGQLFVNPDGHEWLREKWSYPVRQYWKFSESLMLKYADLLICDSKNIEKYIHEDYRKY 
APETSYIAYGTDLDKSRLSPTDSWREWYKEKEISENDYYLWGRFVPENNYEVMIREFM 
KSYSRKDFVLITNVEHNSFYEKLKKETGFDKDKRIKFVGTVYNQELLKYIRENAFAYFHG 
HEVGGTNPSLLEALSSTKLNLLLDVGFNREVGEEGAKYWNKDNLHRVIDSCEQLSQEQIN 
DMDSLSTKQVKERFSWDFIVDEYEKLFKG
CPS2G



MKKILYLHAGAEXiYGADKVLLELIKGLDKNEFEAHVILPNDGVLVPALREVGAQVEVINY 
PILRRKYFNPKGIFDYFISYHHYSKQIAQYAIENKVDIIHNISITTAVLEGIYLKRKLKLPL 
LWHVHEIIVKPKFISDSINFLMGRFADKIVTVSQAVANHIKQSPHIKDDQISVIYNGVDN 
KVFYQSDARSVRERFDIDEEALVIGMVGRVNAWKGQGDFLEAVAPILEQNPKAIAFIAGS 
AFEGEEWRWELEKKISQLKVSSQVXRMDYYANTTELYNMFDIFVLPSTNPDPLPTWLK 
AMACGKPWGYRHGGVCEMVKEGVNGFLVTPNSPLNLSBCVILQLSENINLRKKIGNNSIE 
RQKEHFSLKSYVKNFSKVYTSLKVY
CPS2H



MKIISFTMVNNESEIIESFIRYNYNFIDEMVIIDNGCTDNTMQIIFNLIKEGYKISVYDE 
SLEAYNQYRLDNKYLTKIIAEKNPDLIIPLDADEFLTADSNPRKLLEQLDLEKIHYVNWQ
WFVT^TKKDDINDSFIPRRMQYCFEKPVWHHSDGKPVTKCIISAKYYKKMNLKLSMGHHTV 
FGNPNVRIEHHNDLKFAHYRAISQEQLIYKTICYTIRDIATMENNIETAQRTNQMALIES 
GVDMWETAREASYSGYDCNVIHAPIDLSFCKENIVIKYNELSRETVAERVMKTGREMAVR 
AYNVERKQKEKKFLKPIIFVLDGLKGDEYIHPNPSNHLTILTEMYNVRGLLTDNHQIKFL 
KVNYRLIITPDFAKFLPHEFIWPDTXDIEQVKSQYVGTGVDLSKIISLKEYRKEIGFIG 
NLYALLGFVPNMLNRIYLYIQRNGIANTIIKIKSRL
CPS2I



MQADRRKTFGKMRIRINNLFFVAIAFMGIIISNSQWLAIGKASVIQYLSYLVLILCIVN 
DLLKNNKHIWYJa.GYLFLIIFLFTIGICQQILPITTKIYLSISMMIISVLATLPISLIK 
DIDDFRRISNHLLFALFITSILGIKMGATMFTGAVEGIGFSQGFNGGLTHKNFFGITILM 
GFVLTYLAYKYGSYKRTDRFILGLELFLILISNTRSVYLILLLFLFLVNLDKIKIEQRQW 
STLKYISMLFCAIFLYYFFGFIilTHSDSYAHRVNGLINFFEYYRNDWFHLMFGAADIxAYG 
DLTLDYAIRVRRVLGWT^GTLEMPLLSIMLKNGFIGLVGYGIVLYKLYRNVRILKTDNIKT 
IGKSVFIIWLSATVENYIVNLSFVFMPICFCLLNSISTMESTINKQLQT 
CPS2J



MEKVSIIVPIFNTEKYLRECLDSIISQSYTNLEILLIDDGSSDSSTDICLEYAEQDGRIK 
LFRLPNGGVSNARNYGIKNSTANYIMFVDSDDIVDGNIVESLYTCLKENDSDLSGGLLAT 
FDGNYQESELQKCQIDLEEIKEVRDLGNENFPNHYMSGIFNSPCCKLYKNIYINQGFDTE 
QWLGEDLLFNLNYLPCNIKKVRYVNRNLYFARRSLQSTTNTFKYDVFIQLENLEEKTFDLF 
VKIFGGQYEFSVFKETLQWHIIYYSLLMFKNGDESLPKKLHIFKYLYNRHSLDTLSIKRT 
SSVFKRICKLIVANNLFKIFLNTLIREEKNND
CPS2K



MINISIIVPIYNVEQYLSKCINSIVNQTYKHIEILLVNDGSTDNSEEICLAYAKKDSRIR 
YFKKENGGLSDARNYGISRAKGDYU^FIDSDDFIHSEFIQRLHEAIERENALVAVAGYDR 
VDASGHFLTAEPLPTNQAVLSGRNVCKKLLEADGHRFWAWNKLYKKELFEDFRFEKGKI 
HEDEYFTYRLLYELEPCVAIVKECLYYYVDRENSIITSSMTDHRFHCLLEFQNERMDFYES
RGDKELLLECYRSFLAFAVLFLGKYNHWLSKQQKKLLQTLFRIVYKQLKQNKRLALLMNA 
YYLVGCLHLNFSVFLKTGKDKIQERLRRSESSTR. 
ATCGCCAAAC GAAATTGGCA TTATTTGATA TGATAGCAGT TGCAATTTCT GCAATCTTAA CAAGTCATAT
ACCAAATGCT GATTTAAATC GTTCTGGAAT TTTTATCATA 

ATGATGGTTC ATTATTTTGC ATTTTTTATA TCTCGTATGC CAGTTGAATT TGAGTATAGA GGTAATCTGA 
TAGAGTTTGA AAAAACATTT AACTATAGTA TAATATTTGC 

AATTTTTCTT ACGGCAGTAT CATTTTTGTT GGAGAATAAT TTCGCACTTT CAAGACGTGG TGCCGTGTAT 
TTCACATTAA TAAACTTCGT TTTGGTATAC CTATTTAACG 

TAATTATTAA GCAGTTTAAG GATAGCTTTC TATTTTCGAC AATCTATCAA AAAAAGACGA TTCTAATTAC 
AACGGCTGAA CGATGGGAAA ATATGCAAGT TTTATTTGAA 

TCACATAAAC AAATTCAAAA AAATCTTGTT GCATTGGTAG TTTTAGGTAC AGAAATAGAT AAAATTAATT 
TATCATTACC GCTCTATTAT TCTGTGGAAG AAGCTATAGA 

GTTTTCAACA AGGGAAGTGG TCGACCACGT CTTTATAAAT CTACCAAGTG AGTTTTTAGA CGTAAAGCAA 
TTCGTTTCAG ATTTTGAGTT GTTAGGTATT GATGTAAGCG 

TTGATATTAA TTCATTCGGT TTTACTGCGT TGAAAAACAA AAAAATCCAA CTGCTAGGTG ACCATAGCAT 
TGTAACTTTT TCCACAAATT TTTATAAGCC TAGTCATATC 

ATGATGAAAC GACTTTTGGA TATACTCGGA GCGGTAGTCG GGTTAATTAT TTGTGGTATA GTTTCTATTT 
TGTTAGTTCC AATTATTCGT AGAGATGGTG GACCGGCTAT 

TTTTGCTCAG AAACGAGTTG GACAGAATGG ACGCATATTT ACATTCTACA AGTTTCGATC GATGTATGTT 
GATGCTGAGG AGCGCAAAAA AGACTTGCTC AGCCA7VAACC 

AGATGCAAGG GTGGGTATGT TTTAAAATGG GAAAAACGAT CCTAGAATTA CTCCAATTGG ACATTTCATA 
CGCAAAAACA AGTTTAGACG AGTTACCACA GTTTTATAAT 

GTTTTAATTG GCGATATGAG TCTAGTTGGT ACACGTCCAC CTACAGTTGA TGAATTTGAA AAATATACTC 
CTGGTCAAAA GAGACGATTG AGTTTTAAAC CAGGGATTAC

AGGTCTCTGG CAGGTTAGTG GTCGTAGTAA TATCACAGAC TTCGACGACG TAGTTCGGTT GGACTTAGCA 
TACATTGATA ATTGGACTAT CTGGTCAGAT ATTAAAATTT 

TATTAAAGAC AGTGAAAGTT GTATTGTTGA GAGAGGGAAG TAAGTAAAAG TATATGAAAG TTTGTTTGGT 
CGGTTCTTCA GGGGGACATT TGACTCACTT GTATTTGTTA 

AAACCGTTTT GGAAGGAAGA AGAACGTTTT TGGGTAACAT TTGATAAAGA GGATGCAAGA AGTCTTTTGA 
AGAATGAAAA AATGTATCCA TGTTACTTTC CAACAAATCG 

CAATCTCATT AATTTAGTGA AAAATACTTT CTTAGCTTTC AAAATTTTAC GTGATGAGAA ACCAGATGTT

ATTATTTCAT CTGGTGCGGC CGTTGCTGTC CCCTTCTTTT

ACATCGGAAA ACTATTTGGA GCAAAGACGA TTTATATTGA AGTATTTGAT CGAGTTAATA AATCTACATT

AACTGGAAAA CTAGTTTATC CCGTAACAGA TATTTTTATT 

GTTCAGTGGG AAGAAATGAA GAAGGTATAT CCTAAATCTA TTAACTTGGG GAGTATTTTT TAATGATTTT 
TGTAACAGTA GGAACTCATG AACAACAGTT TAATCGATTG 

ATAAAAGAGA TTGATTTATT GAAAAAAAAT GGAAGTATAA CCGACGAAAT ATTTATTCAA ACAGGATATT 
CTGACTATAT TCCAGAATAT TGCAAGTATA AAAAATTTCT 

CAGTTACAAA GAAATGGAAC AATATATTAA CAAATCAGAA GTAGTTATTT GCCACGGAGG CCCCGCTACT 
TTTATGAATT CATTATCCAA AGGAAAAAAA CAATTATTGT 

TTCCTAGACA AAAAAAGTAT GGTGAACATG TAAATGATCA TCAAGTAGr^G TTTGTAAGAA GAATTTTACA 
AGATAATAAT ATTTTATTTA TAGAAAATAT AGATGATTTG 

TTTGAAAAAA TTATTGAAGT TTCTAAGCAA ACTAACTTTA CATCAAATAA TAATTTTTTT TGTGAAAGAT 
TAAAACAAAT AGTTGAAAAA TTTAATGAGG ATCAAGAAAA 

TGAATAATAA AAAAGATGCA TATTTGATAA TGGCTTATCA TAATTTTTCT CAGATTTTAC TGGAGAGGGA 
TACAGATATT ATCATCTTCT CTCAGGAGAA TGCACACCAT 

TAGTTCCTTC AGAATACCTG TATAATTATT TTAAATATTC TCAGGATTTA TATGTTGAAT TTACAAAAGA 
TGAGCAAAAA TATAAAGAAA ATAGGATATA TGAACGAGTT 

AAATGTTACA GATTATTTCC TAATATATCA GAAAAAACTA TTGATAATGT ACTGTTTAGA ATTTTATTAA 
GAATGTATCG AGCTTTTGAA TACTATTTAC AAAGATTGTT 

GTTTATTGAT AGAATAAAAA ACATGGTCTA AGAATAAGAT TTGGTTCTAA TTGGGTTTCG CTTCCACATG 
ATTTTGTGGC AATTCTTTTA TCAAATGAAA ACGAAACAGC 

TTATTTATTT AAGTAATCTA AATGTCCAGA TGAACTATTT ATACAGACAA TTATAGAAAA ATATGAATTT 
TCAAATAGAT TATCTAAATA TGGAAATTTA AGATATATAA 

AGTGGAAAAA ATCAACATCT TCTCCTATTG TCTTTACAGA TGATTCTATT GATGAATTGC TAAATGCi\AG 
AAATTTAGGT TTTTTATTTG CTAGAAAGTT AAAAATAGAA 

AATAAATCTA AATTTAAAGA AATTATTACT AAAAAATAAA ATAGTTGATT TTGTGAGAGT AATGTATGTT 
TAAATTATTT AAATATGACC CGGAATATTT TATTTTTAAG 

TACTTCTGGT TGATTATTTT TATTCCAGAG CAAAAGTATG TATTTTTATT AATTTTTATG AATTTAATTT 
TATTTCATAT AAAATTTTTG AAAACTAAGC TAATATTAAA 

AAATGAAATT TTATTGTTTT TATTATGGTC TATATTATGT TTTGTTTCAG TAGTCACAAG TATGTTTGTT 
GAAATAAATT TTGAAAGATT ATTTGCAGAT TTTACTGCTC 

CCATAATTTG GATTATTGCA ATAATGTATT ATAATTTGTA TTCATTTATA AATATTGATT ATAAAAAATT 
AAAAAATAGT ATCTTTTTTA GTTTTTTAGT TTTATTAGGT

ATATCTGCAT TGTATATTAT TCAAAATGGG AAAGATATTG TATTTTTAGA CAGACACCTT ATAGGACTAG 
ACTATCTTAT AACAGGCGTC AAAACAAGGT TGGTTGGCTT 

TATGAACTAT CCTACGTTAA ATACCACTAC AATTATAGTT TCAATTCCGT TAATCTTTGC ACTTATAAAA 
AATAAAATGC AACAATTTTT TTTCTTGTGT CTTGCTTTTA 

TACCGATCTA TTTAAGTGGA TCGAGAATTG GTAGTTTATC GCTAGCAATA TTAATTATAT GCTTGTTATG 
GAGATATATA GGTGGAAAAT TTGCTTGGAT AAAAAAGCTA 

ATAGTAATAT TTGTAATACT ACTTATTATT TTAAATACTG AATTGCTTTA CCATGAAATT TTGGCTGTTT 
ATAATTCTAG AGAATCAAGT AACGAAGCTA GATTTATTAT 

TTATCAAGGA AGTATTGATA AAGTATTAGA AAACAATATT TTATTTGGAT ATGGAATATC CGAATATTCA 
GTTACGGGAA CTTGGCTCGG AAGTCATTCA GGCTATATAT 

CATTTTTTTA TAAATCAGGA ATAGTTGGGT TGATTTTACT GATGTTTTCT TTTTTTTATG TTATAAAAAA 
AAGTTATGGA GTTAATGGGG AAACAGCACT ATTTTATTTT 

ACATCATTAG CCATATTTTT CATATATGAA ACAATAGATC CGATTATTAT TATATTAGTA CTATTCTTTT 
CTTCAATAGG TATTTGGAAT AATATAAATT TTAAAAAGGA 

TATGGAGACA AAAAATGAAT GATTTAATTT CAGTTATTGT ACCAATTTAT AATGTCCAAG ATTATCTTGA 
TAAATGTATT AACAGTATTA TTAACCAAAC ATATACTAAT 

TTAGAGGTTA TTCTCGTAAA TGATGGAAGT ACTGATGATT CTGAGAAAAT TTGCTTAAAC TATATGAAGA 
ACGATGGAAG AATTAAATAT TACAAGAAAA TTAATGGCGG 

TCTAGCAGAT GCTCGAAATT TCGGACTAGA ACATGCAACA GGTAAATATA TTGCTTTTGT CGATTCTGAT 
GACTATATAG AAGTTGCAAT GTTCGAGAGA ATGCATGATA 

ATATAACTGA GTATAATGCC GATATAGCAG AGATAGATTT TTGTTTAGTA GACGAAAACG GGTATACAAA
GAAAAAAAGA AATAGTAATT TTCATGTCTT AACGAGAGAA 

GAGACTGTAA AAGAATTTTT GTCAGGATCT AATATAGAAA ATAATGTTTG GTGCAAGCTT TATTCACGAG 
ATATTATAAA AGATATAAAA TTCCAAATTA ATAATAGAAG 

TATTGGTGAG GATTTGCTTT TTAATTTGGA GGTCTTGAAC AATGTAACAC GTGTAGTAGT TGATACTAGA 
GAATATTATT ATAATTATGT CATTCGTAAC AGTTCGCTTA 

TTAATCAGAA ATTCTCTATA AATAATATTG ATTTAGTCAC AAGATTGGAG AATTACCCCT TTAAGTTAAA 
AAGAGAGTTT AGTCATTATT TTGATGCAAA AGTTATTAAA 

GAGAAGGTTA AATGTTTAAA CAAAATGTAT TCAACAGATT GTTTGGATAA TGAGTTCTTG CCAATATTAG 
AGTCTTATCG AAAAGAAATA CGTAGATATC CATTTATTAA 

AGCGAAAAGA TATTTATCAA GAAAGCATTT AGTTACGTTG TATTTGATGA AATTTTCGCC TAAACTATAT 
GTAATGTTAT ATAAGAAATT TCA7VAAGCAG TAGAGGTAAA 

AATGGATAAA ATTAGTGTTA TTGTTCCAGT TTATAATGTA GATAAATATT TAAGTAGTTG TATAGAAAGC 
ATTATTAATC AAAATTATAA AAATATAGAA ATATTATTGA 

TAGATGATGG CTCTGTAGAT GATTCTGCTA AAATATGCAA GGAATATGCA GAAAAAGATA AAAGAGTAAA 
AATTTTTTTC ACTAATCATA GTGGAGTATC AAATGCTAGA 

AATCATGGAA TAAAGCGGAG TACAGCTGAA TATATTATGT TTGTTGACTC TGATGATGTT GTTGATAGTA
GATTAGTAGA AAAATTATAT TTTAATATTA TAAAAAGTAG 

AAGTGATTTA TCTGGTTGTT TGTACGCTAC TTTTTCAGAA AATATAAAi'A ATTTTGAAGT GAATAATCCA 
AATATTGATT TTGAAGCAAT TAATACCGTG CAGGACATGG 

GAGAAAAAAA TTTTATGAAT TTGTATATAA ATAATATTTT TTCTACTCCT GTTTGTAAAC TATATAAGAA 
AAGATACATA ACAGATCTTT TTCAAGAGAA TCAATGGTTA 

GGAGAAGATT TACTTTTTAA TCTGCATTAT TTAAAGAATA TAGATAGAGT TAGTTATTTG ACTGAACATC 
TTTATTTTTA TAGGAGAGGT ATACTAAGTA CAGTAAATTC 

TTTTAAAGAA GGTGTGTTTT TGCAATTGGA AAATTTGCAA AAACAAGTGA TAGTATTGTT TAAGCAAATA 
TATGGTGAGG ATTTTGACGT ATCAATTGTT AAAGATACTA 

TACGTTGGCA AGTATTTTAT TATAGCTTAC TAATGTTTAA ATACGGAAAA CAGTCTATTT TTGACAAATT 
TTTAATTTTT AGAAATCTTT ATAA?^AAATA TTATTTTAAC 

TTGTTAAAAG TATCTAACAA AAATTCTTTG TCTAAAAATT TTTGTATAAG AATTGTTTCG AACAAAGTTT 
TTAAAAAAAT ATTATGGTTA TAATAGGAAG ATATCATGGA 

TACTATTAGT AAAATTTCTA TAATTGTACC TATATATAAT GTAGAAAAAT ATTTATCTAA ATGTATAGAT 
AGCATTGTAA ATCAGACCTA CAT^CATATA GAGATTCTTC 

TGGTGAATGA CGGTAGTACG GATAATTCGG AAGAAATTTG TTTAGCATAT GCGAAGAAAG ATAGTCGCAT 
TCGTTATTTT AAAAAAGAGA ACGGCGGGCT ATCAGATGCC 

CGTAATTATG GCATAAGTCG CGCCAAGGGT GACTACTTAG CTTTTATAGA CTCAGATGAT TTTATTCATT 
CGGAGTTCAT CCAACGTTTA CACGAAGCAA TTGAGAGAGA 

GAATGCCCTT GTGGCAGTTG CTGGTTATGA TAGGGTAGAT GCTTCGGGGC ATTTCTTAAC AGCAGAGCCG 
CTTCCTACAA ATCAGGCTGT TCTGAGCGGC AGGAATGTTT 

GTAAAAAGCT GCTAGAGGCG GATGGTCATC GCTTTGTGGT GGCCTGTAAT AAACTCTATA AAAAAGAACT 
ATTTGAAGAT TTTCGATTTG AAAAGGGTAA GATTCATGAA 
GATGAATACT TCACTTATCG CTTGCTCTAT GAGTTAGAAA AAGTTGCAAT AGTTAAGGAG TGCTTGTACT
ATTATGTTGA CCGAGAAAAT AGTATCACAA CTTCTAGCAT 

GACTGACCAT CGCTTCCATT GCCTACTGGA ATTTCAAAAT GAACGAATGG ACTTCTATGA AAGTAGAGGA 
GATAAAGAGC TCTTACTAGA GTGTTATCGT TCATTTTTAG 

CCTTTGCTGT TTTGTTTTTA GGCAAATATA ATCATTGGTT GAGCAAACAG CAAAAGAAGC TT 
CPS1E



RQTKLALFDMIAVAISAILTSHIPNADLNRSGIFIIMMVHYFAFFISRMPVEFEYRGNLI 
EFEKTFNYSIIFAIFLTAVSFLLENNFALSRRGAVYFTLINFVLVYLFNVIIKQFKDSFL 
FSTIYQKKTILITTAERWENMQVLFESHKQIQKNLVALVVLGTEIDKINLSLPLYYSVEE 
AIEFSTREWDHVFINLPSEFLDVKQFVSDFELLGIDVSVDINSFGFTALBCNKKIQLLGD 
HSIVTFSTNFYKPSHIMMKRLLDILGAWGLIICGIVSILLVPIIRRDGGPAIFAQKRVG 
QNGRIFTFYKFRSMYVDAEERKKDLLSQNQMQGWVCFKMGKTILELLQLDISYAKTSLDE 
LPQFYNVLIGDMSLVGTRPPTVDEFEKYTPGQKRRLSFKPGITGLWQVSGRSNITDFDDV 
VRLDLAYIDNWTIWSDIKILLKTVKWLLREGSK
CPS1F



MKVCLVGSSGGHLTHLYLLKPFWKEEERFWVTFDKEDARSLLKNEKMYPCYFPTORNLIN 
LVKNTFLAFKILRDEKPDVIISSGAAVAVPFFYIGKLFGAKTIYIEVFDRVNKSTLTGKL 
VYPVTDIFIVQWEEMKKVYPKSINLGSIF
CPS1G



MIFVTVGTHEQQFNRIiIKEIDLLKKNGSITDEIFIQTGYSDYIPEYCKYKKFLSYKEMEQ 
YINKSEVVICHGGPATFMNSLSKGKKQLLFPRQKKYGEHVNDHQVEFVRRILQDNNILFI 
ENIDDLFEKIIEVSKQTNFTSNNNFFCERLKQIVEKFNEDQENE 
CPS1H



MFKLFKYDPEYFIFKYFWLIIFIPEQKYVFLLIFMNLILFHIKFLKTKLILKNEILLFLL 
WSILCFVSWTSMFVEINFERLFADFTAPIIWIIAIMYYNLYSFINIDYKKLKNSIFFSF 
LVLLGISALYIIQNGKDIVFLDRHLIGLDYLITGVKTRLVGFMNYPTLNTTTIIVSIPLI 
FALIKNKMQQFFFLCLAFIPIYLSGSRIGSLSPLAILIICLLWRYIGGKFAWIKKLIVIF 
VILLIILNTELLYHEILAVYNSRESSNEARFIIYQGSIDKVLENNILFGYGISEYSVTGT 
WLGSHSGYISFFYKSGIVGLILLMFSFFYVIKKSYGVNGETALFYFTSLAIFFIYETIDP 
IIIILVLFFSSIGIWNNINFKKDMETKNE 
CPS1I



MNDLISVIVPIYNVQDYLDKCINSIINQTYTNLEVILVNDGSTDDSEKICLNYMKNDGRI 
ECYYKKINGGLADARNFGLEHATGKYIAFVDSDDYIEVAMFERMHDNITEYNADIAEIDFC 
LVDENGYTKKKRNSNFHVLTREETVKEFLSGSNIENNVWCKLYSRDIIKDIKFQINNRSI 
GEDLLFNLEVLNNVTRWVDTREYYYNYVIRNSSLINQKFSINNIDLVTRLENYPFKLKR 
EFSHYFDAKVIKEKVKCLNKMYSTDCLDNEFLPILESYRKEIRRYPFIKAKRYLSRKHLV 
TLYLMKFSPKLYVMLYKKFQKQ
CPS1J



MDKISVIVPVYNVDKYLSSCIESIINQNYBCNIEILLIDDGSVDDSAKICKEYEKDKRVKI 
FFTNHSGVSNARNHGIKRSTAEYIMFVDSDDWDSRLVEKLYFNIIKSRSDLSGCLYATF 
SENINNFEVNNPNIDFEAINTVQDMGEKNFMNLXXNNIFSTPVCXLYQKRYITDLFQENQ 
WLGEDLLFNLHYLKNIDRVSYLTEHLYFYRRGILSTVNSFKEGVFLQLENLQKQVIVLFK 
QIYGEDFDVSIVKDTIRWQVFYYSLLMFKYGKQSIFDKFLIFRNLYKKYYFNLLKVSNKN
SLSKNFCIRIVSNKVFKKILWL
CPS1K



MDTISKISIIVPIYNVEKYLSKCIDSIVNQTYKHIEILLVNDGSTDNSEEICLAYAKKDS 
RIRYFKKENGGLSDARNYGISRAKGDYLAFIDSDDFIHSEFIQRLHEAIERENALVAVAG 
YDRVDASGHFLTAEPLPTNQAVLSGRNVCKKLLEADGHRFWACNKLYKKELFEDFRFEK 
GKIHEDEYFTYRLLYELEKVAIVKECLYYYVDRENSITTSSMTDHRFHCLLEFQNERMDF
YESRGDKELLLECYRSFLAFAVLFLGKYNHWLSKQQKK
CPS9

AAGCTTATCG TCAAGGTGTT CGCTATATCG TGGCGACATC TCATAGACGA AAAGGGATGT TTGAAACACC 
AGAAAAAGTT ATCATGACTA ACTTTCTTCA ATTTAAAGAC 

GCAGTAGCAG AAGTTTATCC TGAAATACGA TTGTGCTATG GTGCTGAATT GTATTATAGT AAAGATATAT 
TAAGCAAACT TGAAAAAAAG AAAGTACCCA CACTTAATGG

CTCGCGCTAT ATTCTTTTGG AGTTCAGTAG TGATACTCCT TGGAAAGAGA TTCAAGAAGC AGTGAACGAA 
GTGACGCTAC TTGGGCTAAC TCCCGTACTT GCCCATATAG 

AACGATATGA CGCCCTAGCG TTTCATGCAG AGAGAGTAGA AGAGTTAATT GACAAGGGAT GCTATACTCA 
GGTAAATAGT AATCATGTGC TGAAGCCCAC TTTAATTGGT 

GATCGAGCAA AAGAATTTAA AAAACGTACT CGGTATTTTT TAGAGCAGGA TTTAGTACAT TGTGTTGCTA 
GCGATATGCA TAATTTATCT AGTAGACCTC CGTTTATGAG 

GGAGGCTTAT AAGTTGCTAA CAGAGGAATT TGGCAAAGAT AAAGCGAAAG CGTTGCTAAA AAAGAATCCT 
CTTATGCTAT TAAAAAACCA GGCGATTTAA ACTGGTTACT 

CTAGATTGTG GAGAGAAAAA TGGATTTAGG AACTGTTACT GATAAACTGT TAGAACGCAA CAGTAAACGA 
TTGATACTCG TGTGCATGGA TACGTGTCTT CTTATAGTTT 

CCATGATTTT GAGCAGACTG TTTTTGGATG TTATTATTGA CATACCAGAT GAACGCTTCA TTCTTGCAGT 
TTTATTCGTA TCAATTTTAT ATTTGATTCT ATCGTTTAGA 

TTAAAAGTCT TTTCATTAAT TACGCGTTAC ACAGGGTATC AGAGTTATGT AAAAATAGGA CTTAGTTTAA 
TATCTGCGCA TTCATTGTTT TTAATTATCT CAATGGTGTT 

GTGGCAGGCT TTTAGTTATC GTTTCATCTT AGTATCCTTA TTTTTGTCGT ATGTAATGCT CATTACTCCG 
AGGATTGTTT GGAAAGTCTT ACATGAGACG AGAAAAAATG 

CTATCCGTAA GAAGGATAGC CCACTAAGAA TCTTAGTAGT AGGTGCTGGA GATGGTGGTA ATATTTTTAT 
CAATACTGTC AAAGATCGAA AATTGAATTT TGAAATTGTC

GGTATCGTTG ATCGTGATCC AAATAAACTT GGAACATTTA TCCGTACGGC TAAAGTTTTA GGAAACCGTA 
ATGATATTCC ACGACTGGTA GAGGAATTAG CTGTTGACCA 

AGTGACGATT GCCATCCCTT CTTTAAATGG TAAGGAGCGA GAGAAGATTG TTGAAATCTG TAACACTACA 
GGAGTGACCG TCAATAATAT GCCGAGTATT GAAGACATTA 

TGGCGGGGAA CATGTCTGTC AGTGCCTTTC AGGAAATTGA CGTAGCAGAC CTTCTTGGTC GACCAGAGGT 
TGTTTTGGAT CAGGATGAAT TGAATCAGTT TTTCCAAGGG 

AAAACAATCC TTGTCACAGG AGCAGGTGGC TCTATCGGTT CAGAGCTATG TCGTCAAATT GCTAAGTTTA 
CGCCTAAACG CTTGTTGTTG CTTGGACATG GAGAAAATTC 

AATCTATCTC ATTCATCGAG AGTTACTGGA AAAGTACCAA GGTAAGATTG AGTTGGTCCC TCTCATTGCA 
GATATTCAAG ATAGAGAATT GATTTTTAGC ATAATGGCTG 

AATATCAACC CGATGTTGTT TATCATGCTG CAGCACATAA GCATGTTCCT TTGATGGAAT ATAATCCACA 
TGAAGCAGTG AAGAATAATA TTTTTGGAAC GAAGAATGTG 

GCTGAGGCGG CTAAAACTGC AAAGGTTGCC AAATTTGTTA TGGTTTCAAC AGATAAAGCT GTTAATCCAC 
CAAATGTCAT GGGAGCGACT AAACGTGTTG CAGAAATGAT 

TGTTACAGGT TTAAACGAGC CAGGTCAGAC TCAATTTGCG GCAGTCCGGT TTGGGAATGT TCTAGGTAGT 
CGTGGAAGTG TTGTTCCGCT ATTCAAAGAG CAAATTAGAA 

AAGGTGGACC TGTTACGGTT ACCGACTTTA GGATGACTCG TTATTTCATG ACGATTCCTG AGGCAAGTCG 
TTTGGTTATC CAAGCTGGAC ATTTGGCAAA AGGTGGAGAA 

ATATTTGTCT TGGATATGGG CGAGCCAGTA CAAATCCTGG AATTGGCAAG AAAAGTTATC TTGTTAAGTG 
GACACACAGA GGAAGAAATC GGGATTGTAG AATCTGGAAT 

CAGACCAGGC GAGAAACTCT ACGAGGAATT ATTATCAACA GAAGAACGTG TCAGCGAACA GATTCATGAA 
AAAATATTTG TGGGTCGCGT TACAAATAAG CAGTCGGACA 

TTGTCAATTC ATTTATCAAT GGATTACTCC AAAAAGATAG AAATGAATTA AAAAATATGT TGATTGAATT 
TGCAAAACAA GAATAAGAAA GTAAAAAATA TTTTTACTTT 

CCTAGAGTTT AAACGATGTT TAAGTTCTAG GAAGGTTAGA ATACCTAATT AACAACAATA TTACTATTTA 
TTAAGAGTCA GATAATAGCA ACTAAGTGCT ACAAACTATC 

TTTATAATAA GTATATTTGG TCAAAAGGGA GATGTGAAAT GTATCCAATT TGTAAACGTA TTTTAGCAAT 
TATTATCTCA GGGATTGCTA TTGTTGTTCT GAGTCCAATT 

TTATTATTGA TTGCATTGGC AATTAAATTA GATTCTAAAG GTCCGGTATT ATTTAAACAA AAGCGGGTTG 
GTAAAAACAA GTCATACTTT ATGATTTATA AATTCCGTTC 

TATGTACGTT GACGCACCAA GTGATATGCC GACTCATCTA TTAAAGGATC CTAAGGCGAT GATTACCAAG 
GTGGGCGCGT TTCTCAGAAA AACAAGTTTA GATGAACTGC 

CACAGCTTTT TAATATTTTT AAAGGTGAAA TGGCGATTGT TGGTCCACGC CCAGCCTTAT GGAATCAATA 
TGACTTAATT GAAGAGCGAG ATAAATATGG TGCAAATGAT 

ATTCGTCCTG GACTAACCGG TTGGGCTCAA ATTAATGGTC GTGATGAATT GGAAATTGAT GAAAAGTCAA 
AATTAGATGG ATATTATGTT CAAAATATGA GTCTAGGTTT 

GGATATTAAA TGTTTCTTAG GTACATTCCT CAGTGTAGCC AGAAGCGAAG GTGTTGTTGA AGGTGGAACA 
GGGCAGAAAG GAAAAGGATG AAATTTTCAG TATTAATGTC 

GGTCTATGAG AAAGAAAAAC CAGAGTTTCT TAGGGAATCT TTGGAAAGCA TCCTTGTCAA TCAAACAATG
ATTCCAACGG AGGTTGTCTT GGTAGAGGAT GGGCCACTCA

ATCAGAGCTT ATATAGTATT TTAGAAGAAT TTAAAAGTCG ATTTTCATTT TTTAAAACGA TAGCCTTGGA
AAAGAATTCG GGTTTAGGAA TTGCACTGAA TGAAGGTTTG

AAACATTGTA ATTATGAGTG GGTTTGCACG AAATGGATTC TGATGATGTT GCATATACAT ACACGTTTTG
AAAAGCAAGT TAACTTTATA AAACAAAACC CGACTATAGA

TATTGAGATA GATGAGTTCT TAAATTCTAC TAGTGAAATA GTTTCTCATA AAAATGTTCC AACCCAGCAC
GATGAAATAT TAAAGATGGC AAGGCGGGAG AAATCCATGT

GCCACATGAC TGTAATGTTT AAAAAGAAAA GTGTCGAGAG AGCAGGGGGG TATCAAACAC TTCCGTACGT
AGAAGATTAT TTCCTTTGGG TGCGCATGAT TGCTTCAGGA

TCGAAATTTG CAAACATTGA TGAAACACTA GTTCTTGCAC GTGTTGGAAA TGGGATGTTC AATAGGAGGG
GGAACAGAGA ACAAATTAAC AGTTGGACAT TACTAATTGA

ATTTATGTTA GCTCAAGGAA TTGTTACACC ACTAGATGTA TTTATTAATC AAATTTACAT TAGGGTCTTT
GTTTATATGC CAACTTGGAT AAAGAAACTC ATTTATGGAA

AAATCTTAAG GAAATAGTAT GATTACAGTA TTGATGGCTA CATATAATGG AAGCCCATTT ATAATAAAAC
AGTTAGATTC AATTCGAAAT CAAAGTGTAT CAGCAGACAA

AGTTATTATT TGGGATGATT GCTCGACAGA TGATACAATA AAAATAATAA AAGATTATAT AAAAAAATAT
TCTTTGGATT CATGGGTTGT CTCTCAAAAT AAATCTAATC

AGGGGCATTA TCAAACATTT ATAAATTTGA CAAAGTTAGT TCAGGAAGGA ATAGTCTTTT TTTCAGATCA
AGATGATATT TGGGACTGTC ATAAAATTGA GACAATGCTT

CCAATCTTTG ACAGAGAAAA TGTATCAATG GTGTTTTGCA AATCCAGATT GATTGATGAA AACGGAAATA
TTATCAGTAG CCCAGATACT TCGGATAGAA TCAATACGTA

CTCTCTAGA



Fig. 5 cont.
CPS9D



AYRQGVRYIVATSHRRKGMFETPEKVIMTNFLQFKDAVAEVYPEIRLCYGAELYYSKDIL 
SKLEKKKVPTLNGSRYILLEFSSDTPWKEIQEAVNEVTLLGLTPVLAHIERYDALAFHAE 
RVEELIDKGCYTQVNSNHVLKPTLIGDRAKEFKKRTRYFLEQDLVHCVASDMHNLSSRPP 
FMREAYKLLTEEFGKDKAKALLKKNPLMLLKNQAI . 



Fig. 5 cont.
CPS9E



MDLGTVTDKLLERNSKRLILVCMDTCLLIVSMILSRLFLDVIIDIPDERFILAVLFVSIL 
YLILSFRLKVFSLITRYTGYQSYVKIGLSLISAHSLFLIISMVLWQAFSYRFILVSLFLS
YVMLITPRIVWKVLHETRKNAIRKKDSPLRILVVGAGDGGNIFINTVKDRKLNFEIVGIV
DRDPNKLGTFIRTAKVLGNRNDIPRLVEELAVDQVTIAIPSLNGKEREKIVEICNTTGVT 
VNNMPSIEDIMAGNMSVSAFQEIDVADLLGRPEVVLDQDELNQFFQGKTILVTGAGGSIG 
SELCRQIAKFTPKRLLLLGHGENSIYLIHRELLEKYQGKIELVPLIADIQDRELIFSIMA 
EYQPDVVYHAAAHKHVPLMEYNPHEAVKNNIFGTKNVAEAAKTAKVAKFVMVSTDKAVNP
PNVMGATKRVAEMIVTGLNEPGQTQFAAVRFGNVLGSRGSVVPLFKEQIRKGGPVTVTDF
RMTRYFMTIPEASRLVIQAGHLAKGGEIFVLDMGEPVQILELARKVILLSGHTEEEIGIV 
ESGIRPGEKLYEELLSTEERVSEQIHEKIFVGRVTNKQSDIVNSFINGLLQKDRNELKNM
LIEFAKQE 



Fig. 5 cont.
CPS9F



MYPICKRILAIIISGIAIVVLSPILLLIALAIKLDSKGPVLFKQKRVGKNKSYFMIYKFR 
SMYVDAPSDMPTHLLKDPKAMITKVGAFLRKTSLDELPQLFNIFKGEMAIVGPRPALWNQ
YDLIEERDKYGANDIRPGLTGWAQINGRDELEIDEKSKLDGYYVQNMSLGLDIKCFLGTF 
LSVARSEGVVEGGTGQKGKG



Fig. 5 cont.
CPS9G



MKFSVLMSVYEKEKPEFLRESLESILVNQTMIPTEVVLVEDGPLNQSLYSILEEFKSRFS
FFKTIALEKNSGLGIALNEGLKHCNYEWVCTKWILMMLHIHTRFEKQVNFIKQNPTIDIE 
IDEFLNSTSEIVSHKNVPTQHDEILKMARREKSMCHMTVMFKKKSVERAGGYQTLPYVED 
YFLWVRMIASGSKFANIDETLVLARVGNGMFNRRGNREQINSWTLLIEFMLAQGIVTPLD 
VFINQIYIRVFVYMPTWIKKLIYGKILRK 



Fig. 5 cont.
CPS9H



MITVLMATYNGSPFIIKQLDSIRNQSVSADKVIIWDDCSTDDTIKIIKDYIKKYSLDSWV 
VSQNKSNQGHYQTFINLTKLVQEGIVFFSDQDDIWDCHKIETMLPIFDRENVSMVFCKSR 
LIDENGNIISSPDTSDRINTYSL 



Fig. 5 cont. 



